Abstract. A subset $A$ of $\mathbb{N}$ is called an IP-set if $A$ contains all finite sums of distinct terms of
some infinite sequence $(x_n)_{n\in\mathbb{N}}$ of natural numbers. Central sets, first introduced by Furstenberg
using notions from topological dynamics, constitute a special class of IP-sets possessing additional 
nice combinatorial properties: Each central set contains arbitrarily long arithmetic progressions, and 
solutions to all partition regular systems of homogeneous linear equations. In this paper we show 
how certain families of aperiodic words of low factor complexity may be used to generate a wide 
assortment of central sets having additional nice properties inherited from the rich combinatorial 
structure of the underlying word. We consider Sturmian words and their extensions to higher al- 
phabets (so-called Arnoux-Rauzy words), as well as words generated by substitution rules including 
the famous Thue-Morse word. We also describe a connection between central sets and the strong 
coincidence condition for fixed points of primitive substitutions which represents a new approach to 
the strong coincidence conjecture for irreducible Pisot substitutions. Our methods simultaneously 
exploit the general theory of combinatorics on words, the arithmetic properties of abstract numera- 
tion systems defined by substitution rules, notions from topological dynamics including proximality 
and equicontinuity, the spectral theory of symbolic dynamical systems, and the beautiful and elegant 
theory, developed by N. Hindman, D. Strauss and others, linking IP-sets to the algebraic/topological 
properties of the Stone-Čech compactification of $\mathbb{N}$. Using the key notion of $p$-$\lim_n$, regarded as a
mapping from words to words, we apply ideas from combinatorics on words in the framework of 
ultrafilters. 
1. Introduction

Let N = {0, 1, 2, 3, . . .} denote the set of natural numbers, and Fin(N) the set of all non-empty 
finite subsets of N. 

Definition 1.1. A subset $A$ of $\mathbb{N}$ is called an IP-set if $A$ contains $\{\sum_{n\in F} x_n \mid F \in \mathrm{Fin}(\mathbb{N})\}$ for
some infinite sequence of natural numbers $x_0 < x_1 < x_2 < \cdots$. A subset $A \subseteq \mathbb{N}$ is called an IP*-set
if $A \cap B \neq \emptyset$ for every IP-set $B \subseteq \mathbb{N}$.
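
As a small illustration of Definition 1.1, the following sketch enumerates the finite sums of a finite initial segment of a sequence; an IP-set generated by that sequence must contain all of them. The function name `finite_sums` and the sample terms are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations


def finite_sums(xs):
    """Return all sums over non-empty sets of distinct indices of xs."""
    sums = set()
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            sums.add(sum(subset))
    return sums


# Hypothetical initial segment x_0 < x_1 < x_2 of a sequence.
xs = [1, 4, 16]
print(sorted(finite_sums(xs)))
```

For a segment of length k this produces at most 2^k - 1 values, so the check is only practical for short segments.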

By a celebrated result of N. Hindman [23], given any finite partition of N, at least one element of 
the partition is an IP-set. It follows from Hindman's theorem that every IP*-set is an IP-set, but the 
converse is in general not true. In fact, more generally Hindman shows that given any finite parti- 
tion of an IP-set, at least one element of the partition is again an IP-set. In other words the property 
of being an IP-set is partition regular, i.e., cannot be destroyed via a finite partitioning. Other 
examples of partition regularity are given by the pigeonhole principle, sets having positive upper 
density, and sets having arbitrarily long arithmetic progressions (Van der Waerden's theorem). In
[22], Furstenberg introduced a special class of IP-sets, called central sets, having a substantial
combinatorial structure. The property of being central is also partition regular. Central sets were 
originally defined in terms of topological dynamics: 

Definition 1.2. A subset $A \subseteq \mathbb{N}$ is called central if there exists a compact metric space $(X, d)$ and
a continuous map $T : X \to X$, points $x, y \in X$ and a neighborhood $U$ of $y$ such that

• $y$ is a uniformly recurrent point in $X$,

• $x$ and $y$ are proximal,

• $A = \{n \in \mathbb{N} \mid T^n(x) \in U\}$.

We say $A \subseteq \mathbb{N}$ is central* if $A \cap B \neq \emptyset$ for every central set $B \subseteq \mathbb{N}$.

Recall that $x$ is said to be uniformly recurrent in $X$ if for every neighborhood $V$ of $x$ the set
$\{n \mid T^n(x) \in V\}$ is syndetic, i.e., of bounded gap. Two points $x, y \in X$ are said to be proximal if
for every $\varepsilon > 0$ there exists $n \in \mathbb{N}$ such that $d(T^n(x), T^n(y)) < \varepsilon$. We remark that from the above
definition, it is not at all evident that central sets are IP-sets. We later give an alternative definition
(see Definition 3.5) which makes this point clear. The equivalence between the two definitions is
due to Bergelson and Hindman Q. 

The question of determining whether a given subset $A \subseteq \mathbb{N}$ is an IP-set or a central set is typically
quite difficult, even though for every $A$, either $A$ or its complement is an IP-set (resp. central set).
It turns out that in each case this question may be reformulated in terms of whether or not the set $A$
belongs to a certain class of ultrafilters on $\mathbb{N}$ (see Theorem 5.12 in [26] in the case of IP-sets and
in the case of central sets). But the question of belonging or not to a given (non-principal) ultrafilter
is generally equally mysterious. An equivalent word combinatorial reformulation of this question
is as follows: Given a binary word $\omega = \omega_0\omega_1\omega_2\cdots \in \{0, 1\}^{\mathbb{N}}$, put $\omega|_0 = \{n \in \mathbb{N} \mid \omega_n = 0\}$
and $\omega|_1 = \{n \in \mathbb{N} \mid \omega_n = 1\}$. The question is then to determine whether the set $\omega|_0$ or $\omega|_1$ is an
IP-set or central set. Of course in general, this reformulation is as difficult as the original question.
However, should the word $\omega$ be characterized by some rich combinatorial properties, or be generated
by some "simple" combinatorial or geometric algorithm (such as a substitution rule, a finite
state automaton, a Toeplitz rule...) or arise as a natural coding of a reasonably simple symbolic 
dynamical system, then the underlying rigid combinatorial structure of the word may provide in- 
sight to our previous question. Furthermore, such families of words may be used to obtain simple 
constructions of central sets having additional nice properties inherited from the rich underlying 
combinatorial structure. One of our objectives here is to illustrate this latter point. 

Let $A$ denote a finite non-empty set (called the alphabet) and $\omega = \omega_0\omega_1\omega_2\cdots \in A^{\mathbb{N}}$. For each
finite word $u$ on the alphabet $A$ we set

$$\omega|_u = \{n \in \mathbb{N} \mid \omega_n\omega_{n+1}\cdots\omega_{n+|u|-1} = u\}.$$

In other words, $\omega|_u$ denotes the set of all occurrences of $u$ in $\omega$.
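
The set of occurrences can be computed directly on a finite prefix of an infinite word. The sketch below is only an illustration: the helper name `occurrences` and the sample prefix are assumptions, and a finite prefix can of course only reveal an initial segment of the infinite set.

```python
def occurrences(word, u):
    """Positions n with word[n:n+len(u)] == u (overlaps included)."""
    k = len(u)
    return [n for n in range(len(word) - k + 1) if word[n:n + k] == u]


# Assumed finite prefix of an infinite binary word.
prefix = "0100101001001"
print(occurrences(prefix, "01"))
print(occurrences(prefix, "00"))
```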

In this paper we investigate partitions of $\mathbb{N}$ by sets of the form $\omega|_u$ defined by words $\omega$ of low factor
complexity. Our goal is to study these partitions in the framework of IP-sets and central sets. All
infinite words $\omega \in A^{\mathbb{N}}$ considered in this paper are uniformly recurrent. As we shall see, in our
framework IP-sets and central sets are one and the same:

Theorem 1. Let $\omega \in A^{\mathbb{N}}$ be uniformly recurrent. Then the set $\omega|_u$ is an IP-set if and only if it is a
central set.

The above theorem allows us to simultaneously state our results in terms of IP-sets and central sets. 

We begin by considering the simplest aperiodic infinite words, namely Sturmian words. Sturmian
words are infinite words over a binary alphabet having exactly $n + 1$ factors of length $n$ for
each $n \geq 0$. Their origin can be traced back to the astronomer J. Bernoulli III in 1772. A fundamental
result due to Morse and Hedlund [31] states that each aperiodic (meaning non-ultimately
periodic) infinite word must contain at least $n + 1$ factors of each length $n \geq 0$. Thus Sturmian
words are those aperiodic words of lowest factor complexity. They arise naturally in many different 
areas of mathematics including combinatorics, algebra, number theory, ergodic theory, dynamical 
systems and differential equations. Sturmian words are also of great importance in theoretical 
physics and in theoretical computer science and are used in computer graphics as digital approxi- 
mation of straight lines. 

Let $\omega \in \{0, 1\}^{\mathbb{N}}$ be a Sturmian word, and let $\Omega$ denote the shift orbit closure of $\omega$. Then $\Omega$
contains a unique word $\tilde\omega$ (called the characteristic word) having the property that both
$0\tilde\omega, 1\tilde\omega \in \Omega$. In order to state our results, we must distinguish between two cases:

Definition 1.3. A Sturmian word $\omega$ is called nonsingular if it does not contain the characteristic
word $\tilde\omega$ as a proper tail. Otherwise it is said to be singular.

Theorem 2. Let $\omega \in \Omega$ be a nonsingular Sturmian word, and $u$ a factor of $\omega$. Then $\omega|_u$ is an IP-set
(resp. central set) if and only if $u$ is a prefix of $\omega$. In other words, for every prefix $u$ of $\omega$, the set
$\omega|_u$ is an IP*-set (resp. central*-set).

As a corollary we deduce that

Corollary 1. Let $\omega \in \Omega$ be a nonsingular Sturmian word. For every factor $v$ of $\omega$ and $n \in \omega|_v$ the
set $\omega|_v - n$ is a central* set.
We note that in general the property of being an IP*-set is not translation invariant. See also
Theorem 1.1 in tH. As an immediate consequence to the previous corollary, we have 

Corollary 2. For each $r \geq 1$ there exists a partition of $\mathbb{N}$ into sets $A_0, A_1, \ldots, A_r$ such that for
each $0 \leq i \leq r$ and $n \in \mathbb{N}$, exactly one of the sets $\{A_0 - n, A_1 - n, \ldots, A_r - n\}$ is an IP*-set
(resp. central* set).

In fact, given $r \geq 1$, let $\omega$ be any nonsingular Sturmian word (for instance the Fibonacci word)
and let $\mathcal{F}_\omega(r)$ denote the set of all factors of $\omega$ of length $r$. Then the $r + 1$ sets $\omega|_u$ with $u \in \mathcal{F}_\omega(r)$
define a partition of $\mathbb{N}$ with the required property.

For singular Sturmian words $\omega$ we have

Theorem 3. Let $\omega \in \Omega$ be a Sturmian word such that $T^{n_0}(\omega) = \tilde\omega$ with $n_0 \geq 1$. Then $\omega|_u$ is an
IP-set (resp. central set) if and only if either $u$ is a prefix of $\omega$ or a prefix of $\omega'$, where $\omega'$ is the
unique other element of $\Omega$ with $T^{n_0}(\omega') = \tilde\omega$.

Some (but not all) of the results on Sturmian partitions extend to so-called Arnoux-Rauzy words, 
which may be regarded as natural combinatorial extensions of Sturmian words to larger alphabets 

ID. 

We also consider partitions defined by words generated by substitution rules. For instance, by 
considering partitions of N defined by words generated by the generalized Thue-Morse substitution 
to an alphabet of size $r \geq 2$, we show that

Theorem 4. For each pair of positive integers $r$ and $N$ there exists a partition of

$$\mathbb{N} = A_1 \cup A_2 \cup \cdots \cup A_r$$

such that

• $A_i - n$ is a central set for each $1 \leq i \leq r$ and $1 \leq n \leq N$.

• For each $n > N$, exactly one of the sets $\{A_1 - n, A_2 - n, \ldots, A_r - n\}$ is a central set.

The second assertion of Theorem 4 relies on the fact that each fixed point of the generalized Thue-Morse
substitution is distal. At least in the case of the Thue-Morse substitution itself this may
already be known, but the authors have been unable to locate this result anywhere in the literature.
Our proof of this fact uses a result of V. Baker, M. Barge and J. Kwapisz which states that for
subshifts $(X, T)$ generated by primitive substitutions of Pisot type, the maximal equicontinuous
factor $\pi : X \to X_{eq}$ is finite-to-one [3].

By considering partitions defined by words generating minimal subshifts which are topologically
weak mixing (for example the subshift generated by the substitution $0 \mapsto 001$ and $1 \mapsto 11001$) we
prove that

Theorem 5. For each positive integer $r$ there exists a partition of $\mathbb{N} = A_1 \cup A_2 \cup \cdots \cup A_r$ such
that for each $1 \leq i \leq r$ and $n \geq 0$, the set $A_i - n$ is a central set.

We also consider words on infinite alphabets. Via iterated palindromic closures (see Definition 7.1),
we construct a uniformly recurrent infinite word $\omega$ on an infinite alphabet $A$ which
gives rise to an infinite partition of $\mathbb{N}$ into central sets:



CENTRAL SETS DEFINED BY WORDS OF LOW FACTOR COMPLEXITY 



Theorem 6. Let $\Delta$ be a right infinite word on a finite or infinite alphabet $A$ with the property that
each letter $a \in A$ occurs in $\Delta$ an infinite number of times. Let $\psi$ denote the iterated palindromic
operator and set $\omega = \psi(\Delta)$. Then

(1) $\omega$ is uniformly recurrent and closed under reversal, i.e., if $v = v_1v_2\cdots v_k$ is a factor of $\omega$,
then so is its mirror image $v_k\cdots v_2v_1$.

(2) The set $\omega|_a + 1$ is a central set for each letter $a \in A$.

In particular if we take the word $\Delta$ to be on an infinite alphabet, the sets $\{\omega|_a + 1\}_{a \in A}$ form a
countably infinite collection of pairwise disjoint central subsets of $\mathbb{N}$.$^1$

An important open problem in the theory of substitutions is the so-called strong coincidence 
conjecture which states that each pair of fixed points x and y of an irreducible primitive substitution 
of Pisot type satisfy the following condition called the strong coincidence condition: There exist a 
letter a and a pair of Abelian equivalent words s, t, such that sa is a prefix of x and ta is a prefix 
of y. This combinatorial condition, originally due to P. Arnoux and S. Ito, is an extension of a 
similar condition considered by F.M. Dekking in [14] in the case of uniform substitutions. In this
case Dekking proves that the condition is satisfied if and only if the associated substitutive subshift 
has pure discrete spectrum, i.e., is metrically isomorphic with translation on a compact Abelian 
group. The strong coincidence conjecture has been verified for irreducible primitive substitutions 
of Pisot type on a binary alphabet by M. Barge and B. Diamond 0|. The following establishes a 
link between the strong coincidence conjecture and central sets: 

Theorem 7. Let $\tau$ be a primitive substitution verifying the strong coincidence condition. Then for
any pair of fixed points $x$ and $y$, and any prefix $u$ of $y$, we have that $x|_u$ is a central set.

Our proof of Theorem 7 makes use of the so-called Dumont-Thomas numeration systems defined
by substitutions, and constitutes a new approach to the strong coincidence conjecture.

The main results in this paper rely on various interactions between different areas of mathe- 
matics, some of which had not previously been directly linked: They include the general theory 
of combinatorics on words, the arithmetic properties of abstract numeration systems defined by 
substitutions, topological dynamics, the spectral theory of symbolic dynamical systems, and the 
beautiful theory, developed by Hindman, Strauss and others, linking IP-sets and central sets to 
the algebraic/topological properties of the Stone-Čech compactification $\beta\mathbb{N}$. We regard $\beta\mathbb{N}$ as the
collection of all ultrafilters on $\mathbb{N}$. An ultrafilter may be thought of as a $\{0, 1\}$-valued finitely additive
probability measure defined on all subsets of $\mathbb{N}$. This notion of measure induces a notion of
convergence ($p$-$\lim_n$) for sequences indexed by $\mathbb{N}$, which we regard as a mapping from words to
words. This key notion of convergence allows us to apply ideas from combinatorics on words in 
the framework of ultrafilters. 

The paper is organized as follows: In §2 we present some of the basic ideas and tools from 
combinatorics on words which will be used throughout the paper. In §3 we outline the key features 
of the algebraic and topological properties of the Stone-Čech compactification $\beta\mathbb{N}$ in connection
with IP-sets and central sets. Since the material in §2 may be unfamiliar to specialists in topolog- 
ical semigroups and vice-versa, we take some care to explain both topics in an attempt to make 
the paper more accessible. In §4 we analyze some concrete examples which illustrate some of the 

1 This is a special case of a prior result of Hindman, Leader and Strauss [25] in which they show that every central
set in N is a countable union of pairwise disjoint central sets. 
results mentioned above in Theorems 2 and 3. We use nothing more than the combinatorial properties
of the words considered (all generated by substitutions) and the arithmetic properties of the
underlying Dumont-Thomas numeration system. In §5 we extend the results in §4 to all Sturmian
words, in particular those not generated by substitutions. Here we make use of the algebraic properties
of the semigroup $\beta\mathbb{N}$. In §6 we consider partitions defined by the generalized Thue-Morse
substitution and prove Theorem 4. Also in §6 we prove Theorem 5 by considering subshifts which
are topologically weak mixing. In §7 we consider some infinite words on an infinite alphabet generated
by iteration of the palindromic closure operator. Using these words we construct infinite
partitions of $\mathbb{N}$ and prove Theorem 6. Finally in §8, after a brief review of the Dumont-Thomas
numeration systems defined by substitutions, we discuss a connection between central sets and the 
strong coincidence condition for substitutions. 

Acknowledgements. The authors would like to thank V. Bergelson and Y. Son for many insightful 
e-mail exchanges pertaining particularly to the last section of the paper, and for pointing out to us 
the key feature used in the proof of Theorem 5 relating topologically weak mixing with proximality.
We are also extremely grateful to M. Barge for pointing out to us his joint work with V. Baker and
J. Kwapisz used in the proof of Theorem 4. Finally we wish to thank N. Hindman for his comments
and suggestions on a preliminary version of the paper. The third author is partially supported by 
a grant from the Academy of Finland and by grant no. 090038011 from the Icelandic Research 
Fund. 

2. Words and substitutions 

In this section we give a brief summary of some of the basic background in combinatorics on 
words. 

2.1. Words & subshifts. Given a finite non-empty set $A$ (called the alphabet), we denote by $A^*$,
$A^{\mathbb{N}}$ and $A^{\mathbb{Z}}$ respectively the set of finite words, the set of (right) infinite words, and the set of
bi-infinite words over the alphabet $A$. Given a finite word $u = a_1a_2\cdots a_n$ with $n \geq 1$ and $a_i \in A$,
we denote the length $n$ of $u$ by $|u|$. The empty word will be denoted by $\varepsilon$ and we set $|\varepsilon| = 0$. We
put $A^+ = A^* - \{\varepsilon\}$. For each $a \in A$, we let $|u|_a$ denote the number of occurrences of the letter $a$
in $u$. Two words $u$ and $v$ in $A^*$ are said to be Abelian equivalent, denoted $u \sim_{ab} v$, if and only if
$|u|_a = |v|_a$ for all $a \in A$. It is readily verified that $\sim_{ab}$ defines an equivalence relation on $A^*$.
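
Abelian equivalence amounts to comparing letter counts, which the following minimal sketch makes concrete. The function name `abelian_equivalent` is a hypothetical helper chosen for illustration.

```python
from collections import Counter


def abelian_equivalent(u, v):
    """True when every letter occurs equally often in u and in v."""
    return Counter(u) == Counter(v)


print(abelian_equivalent("0110", "1001"))
print(abelian_equivalent("001", "011"))
```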

Given an infinite word $\omega \in A^{\mathbb{N}}$, a word $u \in A^+$ is called a factor of $\omega$ if $u = \omega_i\omega_{i+1}\cdots\omega_{i+n}$ for
some natural numbers $i$ and $n$. We denote by $\mathcal{F}_\omega(n)$ the set of all factors of $\omega$ of length $n$, and set

$$\mathcal{F}_\omega = \bigcup_{n \in \mathbb{N}} \mathcal{F}_\omega(n).$$

A factor $u$ of $\omega$ is called right special if both $ua$ and $ub$ are factors of $\omega$ for some pair of distinct
letters $a, b \in A$. Similarly $u$ is called left special if both $au$ and $bu$ are factors of $\omega$ for some pair of
distinct letters $a, b \in A$. The factor $u$ is called bispecial if it is both right special and left special.
For each factor $u \in \mathcal{F}_\omega$ set

$$\omega|_u = \{n \in \mathbb{N} \mid \omega_n\omega_{n+1}\cdots\omega_{n+|u|-1} = u\}.$$

We say $\omega$ is recurrent if for every $u \in \mathcal{F}_\omega$ the set $\omega|_u$ is infinite. We say $\omega$ is uniformly recurrent if
for every $u \in \mathcal{F}_\omega$ the set $\omega|_u$ is syndetic, i.e., of bounded gap.
We endow $A^{\mathbb{N}}$ with the topology generated by the metric

$$d(x, y) = \frac{1}{2^n} \quad \text{where } n = \inf\{k : x_k \neq y_k\}$$

whenever $x = (x_n)_{n\in\mathbb{N}}$ and $y = (y_n)_{n\in\mathbb{N}}$ are two elements of $A^{\mathbb{N}}$. Let $T : A^{\mathbb{N}} \to A^{\mathbb{N}}$ denote the
shift transformation defined by $T : (x_n)_{n\in\mathbb{N}} \mapsto (x_{n+1})_{n\in\mathbb{N}}$. By a subshift on $A$ we mean a pair
$(X, T)$ where $X$ is a closed and $T$-invariant subset of $A^{\mathbb{N}}$. A subshift $(X, T)$ is said to be minimal
whenever $X$ and the empty set are the only $T$-invariant closed subsets of $X$. To each $\omega \in A^{\mathbb{N}}$ is
associated the subshift $(X, T)$ where $X$ is the shift orbit closure of $\omega$. If $\omega$ is uniformly recurrent,
then the associated subshift $(X, T)$ is minimal. Thus any two words $x$ and $y$ in $X$ have exactly the
same set of factors, i.e., $\mathcal{F}_x = \mathcal{F}_y$. In this case we denote by $\mathcal{F}_X$ the set of factors of any word
$x \in X$.

Two points $x, y$ in $X$ are said to be proximal if and only if for each $N > 0$ there exists $n \in \mathbb{N}$
such that

$$x_nx_{n+1}\cdots x_{n+N} = y_ny_{n+1}\cdots y_{n+N}.$$

Two points $x, y \in X$ are said to be regionally proximal if for every prefix $u$ of $x$ and $v$ of $y$, there
exist points $x', y' \in X$ with $x'$ beginning in $u$ and $y'$ beginning in $v$ and with $x'$ proximal to $y'$.
Clearly if two points in $X$ are proximal, then they are regionally proximal. A point $x \in X$ is
called distal if the only point in $X$ proximal to $x$ is $x$ itself. A minimal subshift $(X, T)$ is said to
be topologically mixing if for any pair of factors $u, v \in \mathcal{F}_X$ there exists a positive integer $N$
such that for each $n \geq N$, there exists a block of the form $uWv \in \mathcal{F}_X$ with $|W| = n$. A minimal
subshift $(X, T)$ is said to be topologically weak mixing if for any pair of factors $u, v \in \mathcal{F}_X$
the set

$$\{n \in \mathbb{N} \mid uA^nv \cap \mathcal{F}_X \neq \emptyset\}$$

is thick, i.e., for every positive integer $N$, the set contains $N$ consecutive positive integers.

2.2. Substitutions. Many of the words and subshifts considered in this paper are generated by 
substitutions. A substitution $\tau$ on an alphabet $A$ is a mapping $\tau : A \to A^+$. The mapping $\tau$
extends by concatenation to maps (also denoted $\tau$) $A^* \to A^*$ and $A^{\mathbb{N}} \to A^{\mathbb{N}}$. The Abelianization
of $\tau$ is the square matrix $M_\tau$ whose $ij$-th entry is equal to $|\tau(j)|_i$, i.e., the number of occurrences of
$i$ in $\tau(j)$. A substitution $\tau$ is said to be primitive if there is a positive integer $n$ such that for each pair
$(i, j) \in A \times A$, the letter $i$ occurs in $\tau^n(j)$. Equivalently, if all the entries of $M_\tau^n$ are strictly positive.
In this case it is well known that the matrix $M_\tau$ has a simple positive Perron-Frobenius eigenvalue
called the dilation of $\tau$. A substitution $\tau$ is said to be irreducible if the minimal polynomial of its
dilation is equal to the characteristic polynomial of its Abelianization $M_\tau$. A substitution $\tau$ is said
to be of Pisot type if its dilation is a Pisot number. Recall that a Pisot number is an algebraic integer 
greater than 1 all of whose algebraic conjugates lie strictly inside the unit circle. 
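
The Abelianization and the primitivity test can be sketched in a few lines. This is an illustration under stated assumptions: substitutions are represented as Python dictionaries from letters to strings, the helper names are hypothetical, and the search for a positive power is cut off at Wielandt's bound $(d-1)^2 + 1$ for a $d \times d$ matrix, a classical fact that is not part of the text above.

```python
def abelianization(tau, alphabet):
    """Matrix M with M[i][j] = number of occurrences of alphabet[i] in tau(alphabet[j])."""
    return [[tau[b].count(a) for b in alphabet] for a in alphabet]


def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def is_primitive(M):
    """Check whether some power M^n with 1 <= n <= (d-1)^2 + 1 is strictly positive."""
    d = len(M)
    bound = (d - 1) ** 2 + 1  # Wielandt's bound (assumed, see lead-in)
    P = M
    for _ in range(bound):
        if all(x > 0 for row in P for x in row):
            return True
        P = mat_mul(P, M)
    return False


fibonacci = {"0": "01", "1": "0"}
M = abelianization(fibonacci, "01")
print(M, is_primitive(M))
```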

Let $\tau$ be a primitive substitution on $A$. A word $\omega \in A^{\mathbb{N}}$ is called a fixed point of $\tau$ if $\tau(\omega) = \omega$,
and is called a periodic point if $\tau^m(\omega) = \omega$ for some $m > 0$. Although $\tau$ may fail to have a
fixed point, it has at least one periodic point. Associated to $\tau$ is the topological dynamical system
$(X, T)$, where $X$ is the shift orbit closure of a periodic point $\omega$ of $\tau$. The primitivity of $\tau$ implies
that $(X, T)$ is independent of the choice of periodic point and is minimal.

An important example of a primitive substitution is the Thue-Morse substitution defined by the 
morphism $0 \mapsto 01$ and $1 \mapsto 10$. It has two fixed points

u = 011010011001011010010110011010 . . .

and

v = 100101100110100101101001100101 . . .

where $u_n = 1 - v_n$ for every $n \geq 0$. Alternatively, it can be shown that $u_n$ is equal to 0 if and
only if the binary expansion of $n$ contains an even number of 1s. For example, $u_5 = u_6 = 0$,
and in fact 5 = 101 and 6 = 110 expressed in base 2. Two other primitive substitutions we will
make reference to, first introduced some thirty years ago by F.M. Dekking and M. Keane, are the
substitutions $0 \mapsto 001$, $1 \mapsto 11100$ and $0 \mapsto 001$, $1 \mapsto 11001$. Both have two fixed points, and have
the same Abelianization. It is shown in [15] that the subshift generated by the first substitution is
topologically mixing, but not the second. But both are topologically weak mixing.
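
The two descriptions of the Thue-Morse word (fixed point of the morphism, and parity of the number of 1s in the binary expansion) are easy to compare on a prefix. In this sketch the helper `iterate_substitution` is a hypothetical name; it assumes the seed letter is a prefix of its own image and that iterates grow in length, which holds for the Thue-Morse morphism.

```python
def iterate_substitution(tau, seed, length):
    """Apply tau repeatedly to seed and return the first `length` letters.

    Assumes the iterates grow in length and that tau(seed) begins with seed,
    so that each iterate is a prefix of the next one.
    """
    word = seed
    while len(word) < length:
        word = "".join(tau[c] for c in word)
    return word[:length]


thue_morse = {"0": "01", "1": "10"}
u = iterate_substitution(thue_morse, "0", 30)
v = iterate_substitution(thue_morse, "1", 30)

# u_n = 0 exactly when the binary expansion of n has an even number of 1s.
parity = "".join(str(bin(n).count("1") % 2) for n in range(30))

print(u)
print(v)
print(parity == u)
```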

2.3. Sturmian words & generalizations. Let $\omega \in A^{\mathbb{N}}$ and set

$$p_\omega(n) = \mathrm{Card}(\mathcal{F}_\omega(n)).$$

The function $p_\omega : \mathbb{N} \to \mathbb{N}$ is called the factor complexity function of $\omega$. Given a minimal subshift
$(X, T)$ on $A$, we have $\mathcal{F}_\omega(n) = \mathcal{F}_{\omega'}(n)$ for all $\omega, \omega' \in X$ and $n \in \mathbb{N}$. Thus we can define the factor
complexity $p_{(X,T)}(n)$ of a minimal subshift $(X, T)$ by

$$p_{(X,T)}(n) = p_\omega(n)$$

for any $\omega \in X$.

A word $\omega \in A^{\mathbb{N}}$ is periodic if there exists a positive integer $p$ such that $\omega_{i+p} = \omega_i$ for all
indices $i$, and it is ultimately periodic if $\omega_{i+p} = \omega_i$ for all sufficiently large $i$. An infinite word is
aperiodic if it is not ultimately periodic. By a celebrated result due to Hedlund and Morse [31], a
word is ultimately periodic if and only if its factor complexity is uniformly bounded. In particular,
in that case $p_\omega(n) < n$ for all $n$ sufficiently large. Words whose factor complexity $p_\omega(n) = n + 1$ for all
$n \geq 0$ are called Sturmian words. Thus, Sturmian words are those aperiodic words having the
lowest complexity. Since $p_\omega(1) = 2$, it follows that Sturmian words are binary words. The most
extensively studied Sturmian word is the so-called Fibonacci word

f = 01001010010010100101001001010010010100101001001010010 . . .

fixed by the morphism $0 \mapsto 01$ and $1 \mapsto 0$. Let $\omega \in \{0, 1\}^{\mathbb{N}}$ be a Sturmian word, and let $\Omega$ denote
the shift orbit closure of $\omega$. The condition $p_\omega(n) = n + 1$ implies the existence of exactly one right
special and one left special factor of each length. Clearly, given any two left special factors, one is
necessarily a prefix of the other. It follows that $\Omega$ contains a unique word all of whose prefixes are
left special factors of $\omega$. Such a word is called the characteristic word and denoted $\tilde\omega$. It follows that
both $0\tilde\omega, 1\tilde\omega \in \Omega$. It is readily verified that the Fibonacci word above is a characteristic Sturmian
word. A Sturmian word $\omega$ is called singular if $T^n(\omega) = \tilde\omega$ for some $n \geq 1$. Otherwise it is said to
be nonsingular.
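
The complexity $n + 1$ of the Fibonacci word can be observed numerically. The sketch below counts distinct factors in a finite prefix, which in general only gives a lower bound for the complexity; the prefix length 2000 is an assumption chosen comfortably large for the small lengths tested, and the function names are illustrative.

```python
def fibonacci_prefix(length):
    """First `length` letters of the fixed point of 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def factor_complexity(word, n):
    """Number of distinct factors of length n seen in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


f = fibonacci_prefix(2000)
complexities = [factor_complexity(f, n) for n in range(16)]
print(complexities)
```

Replacing the prefix by that of an ultimately periodic word makes the counts stop growing, in line with the result of Hedlund and Morse quoted above.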

Sturmian words admit various types of characterizations of geometric and combinatorial nature. 
We give two such characterizations which will be used in the paper: as irrational rotations on 
the unit circle and as mechanical words. In [31] Hedlund and Morse showed that each Sturmian 
word may be realized measure-theoretically by an irrational rotation on the circle. That is, every 
Sturmian word is obtained by coding the symbolic orbit of a point x on the circle (of circumference 
one) under a rotation $R_\alpha$ by an irrational angle $\alpha$, $0 < \alpha < 1$, where the circle is partitioned into
two complementary intervals, one of length $\alpha$ and the other of length $1 - \alpha$. And conversely
each such coding gives rise to a Sturmian word. The quantity $\alpha$ is called the slope. Namely, the
rotation by angle $\alpha$ is the mapping $R_\alpha$ from $[0, 1)$ (identified with the unit circle) to itself defined
by $R_\alpha(x) = \{x + \alpha\}$, where $\{x\} = x - \lfloor x \rfloor$ is the fractional part of $x$. Considering a partition of
$[0, 1)$ into $I_0 = [0, 1 - \alpha)$, $I_1 = [1 - \alpha, 1)$, define a word

$$s_{\alpha,\rho}(n) = \begin{cases} 0 & \text{if } R_\alpha^n(\rho) = \{\rho + n\alpha\} \in I_0, \\ 1 & \text{if } R_\alpha^n(\rho) = \{\rho + n\alpha\} \in I_1. \end{cases}$$

One can also define $I'_0 = (0, 1 - \alpha]$, $I'_1 = (1 - \alpha, 1]$; the corresponding word is denoted by $s'_{\alpha,\rho}$.
For a Sturmian word $\omega$ of slope $\alpha$ its subshift $\Omega$ is given by $\Omega = \{s_{\alpha,\rho}, s'_{\alpha,\rho} \mid \rho \in [0, 1)\}$.
A straightforward computation shows that

$$s_{\alpha,\rho}(n) = \lfloor \alpha(n + 1) + \rho \rfloor - \lfloor \alpha n + \rho \rfloor,$$

$$s'_{\alpha,\rho}(n) = \lceil \alpha(n + 1) + \rho \rceil - \lceil \alpha n + \rho \rceil;$$

$s_{\alpha,\rho}$ and $s'_{\alpha,\rho}$ are called the upper and lower mechanical words (of slope $\alpha$) based at $\rho$.
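
The agreement between the rotation coding and the floor formula can be checked numerically. The sketch below uses floating-point arithmetic, so it is only reliable while $\rho + n\alpha$ stays away from interval endpoints; the function names and the choice $\alpha = (3 - \sqrt{5})/2$, $\rho = \alpha$ (which should reproduce a prefix of the Fibonacci word) are assumptions made for illustration.

```python
import math


def rotation_coding(alpha, rho, length):
    """Code the orbit of rho under x -> {x + alpha} with I_0 = [0, 1-alpha), I_1 = [1-alpha, 1)."""
    return "".join(
        "1" if (rho + n * alpha) % 1.0 >= 1 - alpha else "0"
        for n in range(length)
    )


def floor_word(alpha, rho, length):
    """Word given by floor(alpha*(n+1) + rho) - floor(alpha*n + rho)."""
    return "".join(
        str(math.floor(alpha * (n + 1) + rho) - math.floor(alpha * n + rho))
        for n in range(length)
    )


def ceil_word(alpha, rho, length):
    """Word given by ceil(alpha*(n+1) + rho) - ceil(alpha*n + rho)."""
    return "".join(
        str(math.ceil(alpha * (n + 1) + rho) - math.ceil(alpha * n + rho))
        for n in range(length)
    )


alpha = (3 - math.sqrt(5)) / 2  # assumed slope for the Fibonacci word
print(floor_word(alpha, alpha, 30))
print(rotation_coding(alpha, alpha, 30))
```

With $\rho = 0$ the floor and ceiling words differ only in their first letter, which illustrates why both codings are needed to describe the whole subshift.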

In [0Q| Arnoux and Rauzy introduced a class of uniformly recurrent (minimal) sequences u on 
an $m$-letter alphabet of complexity $p_u(n) = (m - 1)n + 1$ characterized by the following combinatorial
criterion known as the * condition: $u$ admits exactly one right special and one left special
factor of each length. We call them Arnoux-Rauzy sequences. This condition distinguishes them
from other sequences of complexity $(m - 1)n + 1$ such as those obtained by coding trajectories
of $m$-interval exchange transformations. These words are generally regarded as natural combinatorial
generalizations of Sturmian words to higher alphabets. In particular, the Fibonacci word
generalizes to the $m$-bonacci word fixed by the substitution

$$\sigma_m : \{0, 1, \ldots, m - 1\} \to \{0, 1, \ldots, m - 1\}^*$$

given by

$$\sigma_m(i) = \begin{cases} 0(i + 1) & \text{for } 0 \leq i < m - 1, \\ 0 & \text{for } i = m - 1. \end{cases}$$

However, many of the dynamical and geometrical interpretations of Sturmian words do not
extend to this new class of words (see [12] for example).
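
The $m$-bonacci substitution and the complexity $(m - 1)n + 1$ can be explored on a finite prefix. As before, counting factors in a prefix only bounds the complexity from below; the prefix length 3000 and the helper names are assumptions made for this sketch.

```python
def m_bonacci_prefix(m, length):
    """Prefix of the fixed point of i -> 0(i+1) for i < m-1 and (m-1) -> 0, as a list of ints."""
    word = [0]
    while len(word) < length:
        image = []
        for i in word:
            image.extend([0, i + 1] if i < m - 1 else [0])
        word = image
    return word[:length]


def factor_complexity(word, n):
    """Number of distinct factors of length n seen in the finite word."""
    return len({tuple(word[i:i + n]) for i in range(len(word) - n + 1)})


tribonacci = m_bonacci_prefix(3, 3000)
print(tribonacci[:13])
print([factor_complexity(tribonacci, n) for n in range(11)])
```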

In the subsequent sections we will consider partitions of $\mathbb{N}$ defined by words. Let $\omega \in A^{\mathbb{N}}$, and
let $\mathcal{F}$ denote the set of factors of $\omega$. A finite subset $X$ is called an $\mathcal{F}$-prefix code if $X \subseteq \mathcal{F}$ and given
any two distinct elements of $X$, neither one is a prefix of the other. An $\mathcal{F}$-prefix code is $\mathcal{F}$-maximal
if it is not properly contained in any other $\mathcal{F}$-prefix code. The simplest example of an $\mathcal{F}$-maximal
prefix code is the set of all elements of $\mathcal{F}$ of some fixed length $d$. Each $\mathcal{F}$-maximal prefix code $X$
defines a partition

$$\mathbb{N} = \bigcup_{u \in X} \omega|_u.$$

If $\omega$ is a Sturmian word, then the corresponding partition is called a Sturmian partition.
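
To see why a maximal prefix code yields a partition, one can count, for each position of a prefix, how many code words start there. The sketch below does this for a prefix of the Fibonacci word; the function name `cover_counts` and the sample codes are illustrative assumptions.

```python
def cover_counts(word, code):
    """For each position n, count the code words u with word[n:n+len(u)] == u."""
    longest = max(len(u) for u in code)
    return [
        sum(word.startswith(u, n) for u in code)
        for n in range(len(word) - longest + 1)
    ]


f = "010010100100101001010"  # prefix of the Fibonacci word

# {00, 01, 1} is a prefix code made of factors of f; every position is covered once.
print(set(cover_counts(f, ["00", "01", "1"])))

# Dropping 01 leaves positions uncovered, so the sets no longer cover N.
print(set(cover_counts(f, ["00", "1"])))
```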

3. Ultrafilters, IP-sets and central sets

3.1. Stone-Čech compactification. Many of our results rely on the algebraic/topological properties
of the Stone-Čech compactification of $\mathbb{N}$. The Stone-Čech compactification $\beta\mathbb{N}$ of $\mathbb{N}$ is one
of many compactifications of $\mathbb{N}$. It is in fact the largest compact Hausdorff space generated by
$\mathbb{N}$. More precisely $\beta\mathbb{N}$ is a compact and Hausdorff space together with a continuous injection
$i : \mathbb{N} \hookrightarrow \beta\mathbb{N}$ satisfying the following universal property: any continuous map $f : \mathbb{N} \to X$ into a
compact Hausdorff space $X$ lifts uniquely to a continuous map $\beta f : \beta\mathbb{N} \to X$, i.e., $f = \beta f \circ i$.
This universal property characterizes $\beta\mathbb{N}$ uniquely up to homeomorphism. While there are different
methods for constructing the Stone-Čech compactification of $\mathbb{N}$, we shall regard $\beta\mathbb{N}$ as the set
of all ultrafilters on $\mathbb{N}$ with the Stone topology.

Recall that a set $\mathcal{U}$ of subsets of $\mathbb{N}$ is called an ultrafilter if the following conditions hold:

• $\emptyset \notin \mathcal{U}$.

• If $A \in \mathcal{U}$ and $A \subseteq B$, then $B \in \mathcal{U}$.

• $A \cap B \in \mathcal{U}$ whenever both $A$ and $B$ belong to $\mathcal{U}$.

• For every $A \subseteq \mathbb{N}$ either $A \in \mathcal{U}$ or $A^c \in \mathcal{U}$ where $A^c$ denotes the complement of $A$.

For every natural number $n \in \mathbb{N}$, the set $\mathcal{U}_n = \{A \subseteq \mathbb{N} \mid n \in A\}$ is an example of an ultrafilter.
This defines an injection $i : \mathbb{N} \hookrightarrow \beta\mathbb{N}$ by: $n \mapsto \mathcal{U}_n$. An ultrafilter of this form is said to be principal.
By way of Zorn's lemma, one can show the existence of non-principal (or free) ultrafilters.

It is customary to denote elements of $\beta\mathbb{N}$ by letters $p, q, r, \ldots$. For each set $A \subseteq \mathbb{N}$, we set

$A^\circ = \{p \in \beta\mathbb{N} \mid A \in p\}$. Then the set $\mathcal{B} = \{A^\circ \mid A \subseteq \mathbb{N}\}$ forms a basis for the open sets (as well
as a basis for the closed sets) of $\beta\mathbb{N}$ and defines a topology on $\beta\mathbb{N}$ with respect to which $\beta\mathbb{N}$ is
both compact and Hausdorff.$^2$ It is not difficult to see that the injection $i : \mathbb{N} \hookrightarrow \beta\mathbb{N}$ is continuous
and satisfies the required universal property. In fact, given a continuous map $f : \mathbb{N} \to X$ with $X$
compact Hausdorff, for each ultrafilter $p \in \beta\mathbb{N}$, the pushforward $f(p) = \{f(n) \mid n \in p\}$ defines an
ultrafilter on $X$ having a unique limit point $\beta f(p)$.

There is a natural extension of the operation of addition + on $\mathbb{N}$ to $\beta\mathbb{N}$ making $\beta\mathbb{N}$ a compact left-topological
semigroup. More precisely we define addition of two ultrafilters $p, q$ by the following
rule:

$$p + q = \{A \subseteq \mathbb{N} \mid \{n \in \mathbb{N} \mid A - n \in p\} \in q\}.$$

It is readily verified that $p + q$ is once again an ultrafilter and that for each fixed $p \in \beta\mathbb{N}$, the
mapping $q \mapsto p + q$ defines a continuous map from $\beta\mathbb{N}$ into itself.$^3$ The operation of addition in $\beta\mathbb{N}$
is associative and for principal ultrafilters we have $\mathcal{U}_m + \mathcal{U}_n = \mathcal{U}_{m+n}$. However in general addition
of ultrafilters is highly non-commutative. In fact it can be shown that the center is precisely the set
of all principal ultrafilters [26].
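
The identity for principal ultrafilters can be verified by unwinding the definition of addition. The derivation below assumes the usual convention $A - k = \{x \in \mathbb{N} \mid x + k \in A\}$, which is not spelled out in the text, and uses only that a set belongs to $\mathcal{U}_n$ exactly when it contains $n$.

```latex
\begin{align*}
A \in \mathcal{U}_m + \mathcal{U}_n
  &\iff \{k \in \mathbb{N} \mid A - k \in \mathcal{U}_m\} \in \mathcal{U}_n \\
  &\iff A - n \in \mathcal{U}_m \\
  &\iff m \in A - n \\
  &\iff m + n \in A \\
  &\iff A \in \mathcal{U}_{m+n}.
\end{align*}
```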

3.2. IP-sets and central sets. Let $(S, +)$ be a semigroup. An element $p \in S$ is called an idempotent
if $p + p = p$. We recall the following result of Ellis [20]:

Theorem 3.1 (Ellis [20]). Let $(S, +)$ be a compact left-topological semigroup (i.e., $\forall x \in S$ the
mapping $y \mapsto x + y$ is continuous). Then $S$ contains an idempotent.

It follows that $\beta\mathbb{N}$ contains a non-principal ultrafilter $p$ satisfying $p + p = p$. In fact, we could
simply apply Ellis's result to the semigroup $\mathbb{N} - \{0\}$. This would then exclude the only principal
idempotent ultrafilter, namely $\mathcal{U}_0$. From here on, by an idempotent ultrafilter in $\beta\mathbb{N}$ we mean a free
idempotent ultrafilter.

We will make use of the following striking result due to Hindman linking IP-sets and idempotents
in $\beta\mathbb{N}$:

2 Although the existence of free ultrafilters requires Zorn's lemma, the cardinality of $\beta\mathbb{N}$ is $2^{2^{\mathbb{N}}}$ from which it follows
that $\beta\mathbb{N}$ is not metrizable.

3 Our definition of addition of ultrafilters is the same as that given in [6] but is the reverse of that given in [26] in
which $A \in p + q$ if and only if $\{n \in \mathbb{N} \mid A - n \in q\} \in p$. In this case, $\beta\mathbb{N}$ becomes a compact right-topological
semigroup.
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Theorem 3.2 (Theorem 5.12 in 112610 . A subset A C N is an IP-set if and only if A G pfor some 
idempotent p G f3N. 

It follows immediately that $A$ is an IP*-set if and only if $A \in p$ for every idempotent $p \in \beta\mathbb{N}$ (see
Theorem 2.15 in [6]). We also note that the property of being an IP-set is partition regular.

To see the connection between idempotent ultrafilters and IP-sets, consider a set $A_0 \subseteq \mathbb{N}$ belonging
to some idempotent $p \in \beta\mathbb{N}$. Then as $A_0 \in p + p$ it follows that there exists $x_0 \in A_0$ such
that $A_0 \cap (A_0 - x_0) \in p$. Set $A_1 = A_0 \cap (A_0 - x_0)$. Since $A_1 \in p + p$ we can choose $x_1 \in A_1$ ($x_1 \neq x_0$)
such that $A_1 \cap (A_1 - x_1) \in p$. Note that thus far we have $x_0$, $x_1$ and $x_0 + x_1$ all belong to $A_0$. Set
$A_2 = A_1 \cap (A_1 - x_1)$. Again since $A_2 \in p + p$ we can choose $x_2 \in A_2$ (distinct from both $x_0, x_1$) such
that $A_2 \cap (A_2 - x_2) \in p$. Since $x_2 \in A_2$, it follows that $x_2, x_2 + x_1 \in A_1 \subseteq A_0$. Since $x_2, x_2 + x_1 \in A_1$
it follows that $x_2 + x_0, x_2 + x_1 + x_0 \in A_0$. Thus $\{x_0, x_1, x_2, x_0 + x_1, x_0 + x_2, x_1 + x_2, x_0 + x_1 + x_2\} \subseteq A_0$.
Iterating this process we obtain an infinite sequence of distinct points $(x_n)_{n \in \mathbb{N}}$ such that for
any finite subset $F \subset \mathbb{N}$ the sum $\sum_{n \in F} x_n$ belongs to $A_0$. In other words, $A_0$ is an IP-set.

In [22], Furstenberg introduced a special class of IP-sets, called central sets, having additional
rich combinatorial properties. They were originally defined in terms of topological dynamics (see
Definition 1.2). As in the case of IP-sets, they may be alternatively defined in terms of belonging
to a special class of free ultrafilters, called minimal idempotents.⁴ To define a minimal idempotent
we must first review some basic properties concerning ideals in $\beta\mathbb{N}$.

Let $(S, +)$ be any semigroup. Recall that a subset $\mathcal{I} \subseteq S$ is called a right (resp. left) ideal if
$\mathcal{I} + S \subseteq \mathcal{I}$ (resp. $S + \mathcal{I} \subseteq \mathcal{I}$). It is called a two sided ideal if it is both a left and right ideal. A
right (resp. left) ideal $\mathcal{I}$ is called minimal if every right (resp. left) ideal $\mathcal{J}$ included in $\mathcal{I}$ coincides
with $\mathcal{I}$.

We recall some useful facts concerning minimal right ideals of a semigroup (similar considera- 
tions apply to minimal left ideals): 
Facts: 

(1) Let $\mathcal{M}$ be a minimal right ideal of $S$. Then every element $x$ in $\mathcal{M}$ generates $\mathcal{M}$ in the sense
that $\mathcal{M} = x + S = x + \mathcal{M}$.

(2) If $\mathcal{R}$ is a right ideal of $S$ with the property that $\mathcal{R} = x + \mathcal{R}$ for every $x \in \mathcal{R}$, then $\mathcal{R}$ is a
minimal right ideal.

(3) Let $\mathcal{M}$ be a minimal right ideal of $S$. Then $x + \mathcal{M}$ is a minimal right ideal for every $x \in S$.

(4) Every minimal right ideal $\mathcal{M}$ is contained in every two sided ideal $\mathcal{I}$.

Minimal right/left ideals do not necessarily exist; e.g., the commutative semigroup $(\mathbb{N}, +)$ has no
minimal right/left ideals (the ideals in $\mathbb{N}$ are all of the form $\mathcal{I}_n = [n, +\infty) = \{m \in \mathbb{N} \mid m \geq n\}$).
However, 

Proposition 3.3. Every compact Hausdorff left-topological semigroup (e.g., $\beta\mathbb{N}$) admits a minimal
right ideal and a minimal left ideal.

Let $\mathcal{M}$ be a minimal right ideal of a left-topological semigroup $S$. Since $\mathcal{M}$ is of the form $x + S$
with $x \in \mathcal{M}$, it follows that $\mathcal{M}$ is closed. Thus $\mathcal{M}$ is a compact left-topological semigroup and
hence by Ellis [20] contains an idempotent $p$. It is verified that $S + p$ is then a minimal left ideal,
that $p \in S + p$ and that $(p + S) \cap (S + p) = p + S + p$ is a group. More generally the intersection of
any minimal right ideal with any minimal left ideal is a group and hence contains an idempotent.



4 The equivalence between the two definitions is due to Bergelson and Hindman Q. 
Let $K(S)$ denote the union of all minimal right ideals of $S$. Then $K(S)$ is a two sided ideal and
is in fact the smallest such ideal. To see this we first note that $K(S)$ is a right ideal (being the
union of right ideals). To see that $K(S)$ is also a left ideal, let $x \in K(S)$ and $y \in S$. Then $x \in \mathcal{M}$
for some minimal right ideal $\mathcal{M}$. Thus $y + x \in y + \mathcal{M}$ which by Fact (3) is a minimal right ideal.
Hence $y + x \in K(S)$. This shows that $K(S)$ is a two sided ideal of $S$. By Fact (4) it follows that
$K(S)$ is contained in every two sided ideal $\mathcal{I}$.

We could have defined $K(S)$ to be the union of all minimal left ideals of $S$ and in an analogous
way deduced that $K(S)$ is the smallest two sided ideal of $S$. Thus

$$K(S) = \bigcup\{\mathcal{L} \mid \mathcal{L} \text{ is a minimal left ideal of } S\} = \bigcup\{\mathcal{R} \mid \mathcal{R} \text{ is a minimal right ideal of } S\}.$$

Definition 3.4. An idempotent $p$ is called a minimal idempotent if it belongs to $K(S)$.

Thus as every compact left-topological semigroup (e.g. $\beta\mathbb{N}$) contains a minimal right ideal,
and by Ellis every minimal right ideal contains an idempotent, we deduce that every compact
left-topological semigroup contains a minimal idempotent. Alternatively, given two idempotents
$p, q \in S$ we write $p \leq q$ if

$$p + q = q + p = p.$$

It turns out that an idempotent $p$ is minimal if and only if it is minimal with respect to the relation $\leq$.

Definition 3.5. A subset $A \subseteq \mathbb{N}$ is called central if it is a member of some minimal idempotent in
$\beta\mathbb{N}$. It is called a central*-set if it belongs to every minimal idempotent in $\beta\mathbb{N}$.

It follows from the above definition that every central set is an IP-set and that the property of
being central is partition regular. Central sets are known to have substantial combinatorial structure.
For example, any central set contains arbitrarily long arithmetic progressions, and solutions
to all partition regular systems of homogeneous linear equations (see for example [8]). Many of the
rich properties of central sets are a consequence of the so-called Central Sets Theorem first proved
by Furstenberg in Proposition 8.21 in [22] (see also [13, 8, 27]). Furstenberg pointed out that as
an immediate consequence of the Central Sets Theorem one has that whenever $\mathbb{N}$ is divided into
finitely many classes, and a sequence $(x_n)_{n \in \mathbb{N}}$ is given, one of the classes must contain arbitrarily
long arithmetic progressions whose increment belongs to $\{\sum_{n \in F} x_n \mid F \in \mathrm{Fin}(\mathbb{N})\}$.

3.3. Limits of ultrafilters. It is often convenient to think of an ultrafilter $p$ as a $\{0, 1\}$-valued,
finitely additive probability measure on the power set of $\mathbb{N}$. More precisely, for any subset $A \subseteq \mathbb{N}$,
we say $A$ has $p$-measure 1, or is $p$-large, if $A \in p$. This notion of measure gives rise to a notion of
convergence of sequences indexed by $\mathbb{N}$ which is the key tool in allowing us to apply ideas from
combinatorics on words to the framework of ultrafilters. However, from our point of view, it is
more natural to define it alternatively as a mapping from words to words (see Remark 3.13). Let
$\mathbb{A}$ denote a non-empty finite set. Then each ultrafilter $p \in \beta\mathbb{N}$ naturally defines a mapping

$$p^* : \mathbb{A}^{\mathbb{N}} \to \mathbb{A}^{\mathbb{N}}$$

as follows:

Definition 3.6. For each $p \in \beta\mathbb{N}$ and $\omega \in \mathbb{A}^{\mathbb{N}}$, we define $p^*(\omega) \in \mathbb{A}^{\mathbb{N}}$ by the condition: $u \in \mathbb{A}^*$ is
a prefix of $p^*(\omega)$ $\iff$ $\omega|_u \in p$.

We note that if $u, v \in \mathbb{A}^*$, $\omega|_u, \omega|_v \in p$ and $|v| \geq |u|$, then $u$ is a prefix of $v$. In fact, if $v'$ denotes
the prefix of $v$ of length $|u|$ then as $\omega|_v \subseteq \omega|_{v'}$ it follows that $\omega|_{v'} \in p$ and hence $u = v'$. Thus
$p^*(\omega)$ is well defined.

We note that if $\omega, \nu \in \mathbb{A}^{\mathbb{N}}$ and if each prefix $u$ of $\nu$ is a factor of $\omega$, then there exists an ultrafilter
$p \in \beta\mathbb{N}$ such that $p^*(\omega) = \nu$. In fact, the set

$$\mathcal{C} = \{\omega|_u \mid u \text{ is a prefix of } \nu\}$$

satisfies the finite intersection property, and hence by a routine argument involving Zorn's lemma
it follows that there exists a $p \in \beta\mathbb{N}$ with $\mathcal{C} \subseteq p$.

It follows immediately from the definition of $p^*$, Definition 3.5 and Theorem 3.2 that

Lemma 3.7. The set $\omega|_u$ is an IP-set (resp. central set) if and only if $u$ is a prefix of $p^*(\omega)$ for some
idempotent (resp. minimal idempotent) $p \in \beta\mathbb{N}$.

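Lemma 3.8. For each $p \in \beta\mathbb{N}$, $\omega \in \mathbb{A}^{\mathbb{N}}$ and $u \in \mathbb{A}^*$ we have

$$p^*(\omega)|_u = \{m \in \mathbb{N} \mid \omega|_u - m \in p\}$$

where $\omega|_u - m$ is defined as the set of all $n \in \mathbb{N}$ such that $n + m \in \omega|_u$.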



Proof. Suppose $m \in p^*(\omega)|_u$. Then by definition $u$ occurs in position $m$ in $p^*(\omega)$. Let $v$ denote the
prefix of $p^*(\omega)$ of length $|v| = m + |u|$. Then, as $u$ is a suffix of $v$ we have $\omega|_v + m \subseteq \omega|_u$ and
hence $\omega|_v \subseteq \omega|_u - m$. But as $v$ is a prefix of $p^*(\omega)$ we have $\omega|_v \in p$ and hence $\omega|_u - m \in p$ as
required.

Conversely, fix $m \in \mathbb{N}$ such that $\omega|_u - m \in p$. Let $Z$ be the set of all factors $v$ of $\omega$ of length
$|v| = m + |u|$ ending in $u$. Then

$$\omega|_u - m \subseteq \bigcup_{v \in Z} \omega|_v.$$

It follows that there exists $v \in Z$ such that $\omega|_v \in p$. In other words, there exists $v \in Z$ such that $v$
is a prefix of $p^*(\omega)$. It follows that $u$ occurs in position $m$ in $p^*(\omega)$. □

Lemma 3.9. For $p, q \in \beta\mathbb{N}$ and $\omega \in \mathbb{A}^{\mathbb{N}}$, we have $(p + q)^*(\omega) = q^*(p^*(\omega))$. In particular, if $p$ is
an idempotent, then $p^*(p^*(\omega)) = p^*(\omega)$.

Proof. For each word $u \in \mathbb{A}^*$ we have that $u$ is a prefix of $(p + q)^*(\omega)$ if and only if

$$\omega|_u \in p + q \iff \{m \in \mathbb{N} \mid \omega|_u - m \in p\} \in q.$$

On the other hand, $u$ is a prefix of $q^*(p^*(\omega))$ if and only if $p^*(\omega)|_u \in q$. The result now follows
immediately from the preceding lemma. □

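Lemma 3.10. For each $p \in \beta\mathbb{N}$ and $\omega \in \mathbb{A}^{\mathbb{N}}$ we have $p^*(T(\omega)) = T(p^*(\omega))$ where $T : \mathbb{A}^{\mathbb{N}} \to \mathbb{A}^{\mathbb{N}}$
denotes the shift map.

Proof. Assume $u \in \mathbb{A}^*$ is a prefix of $p^*(T(\omega))$. Then $T(\omega)|_u \in p$. But

$$T(\omega)|_u = \bigcup_{a \in \mathbb{A}} \omega|_{au}.$$

It follows that there exists $a \in \mathbb{A}$ such that $\omega|_{au} \in p$. Thus $au$ is a prefix of $p^*(\omega)$ and hence $u$ is a
prefix of $T(p^*(\omega))$. □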
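
The set identity used in the proof of Lemma 3.10 can be checked on a finite prefix of a concrete word. The following is a minimal Python sketch, not part of the original argument: the helper names `fibonacci_word` and `occurrences` are illustrative, the word is the Fibonacci word of Section 4.1, and the sets $\omega|_u$ are necessarily truncated to the chosen prefix length.

```python
def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def occurrences(word, u):
    """Finite approximation of w|_u: positions n at which u occurs in word."""
    return {n for n in range(len(word) - len(u) + 1) if word.startswith(u, n)}


w = fibonacci_word(200)
u = "010"

# T(w)|_u, where T drops the first letter of w.
shifted_occurrences = occurrences(w[1:], u)

# Union over the letters a of w|_{au}.
union_occurrences = occurrences(w, "0" + u) | occurrences(w, "1" + u)

print(shifted_occurrences == union_occurrences)
```

On this prefix both sets range over the same positions, so the comparison is exact rather than approximate.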

In what follows, we will make use of the following key result in [6] (see also Theorem 1 in [10]):






M. BUCCI, S. PUZYNINA, AND L.Q. ZAMBONI 



Theorem 3.11 (Theorem 3.4 in [6]). Let $(X, T)$ be a topological dynamical system. Then if two
points $x, y \in X$ are proximal with $y$ uniformly recurrent, then there exists a minimal idempotent
$p \in \beta\mathbb{N}$ such that $p^*(x) = y$.

As a consequence we have 

Theorem 3.12. Let $\omega \in \mathbb{A}^{\mathbb{N}}$ be a uniformly recurrent word, and let $u \in \mathbb{A}^+$. Then $\omega|_u$ is an IP-set
if and only if $\omega|_u$ is a central set.

Proof. For any $A \subseteq \mathbb{N}$ we have that if $A$ is central then $A$ belongs to some minimal idempotent
$p \in \beta\mathbb{N}$ and hence in particular $A$ belongs to an idempotent in $\beta\mathbb{N}$. Hence by Theorem 3.2 we
have that $A$ is an IP-set. Now suppose that $\omega|_u$ is an IP-set. Then $\omega|_u$ belongs to some idempotent
$p \in \beta\mathbb{N}$. Set $\nu = p^*(\omega)$. Then $u$ is a prefix of $\nu$. Also, since $p$ is idempotent we have $p^*(\nu) =
p^*(p^*(\omega)) = p^*(\omega) = \nu$. Hence for every prefix $v$ of $\nu$ we have that $\nu|_v \in p$ and $\omega|_v \in p$ and hence
$\nu|_v \cap \omega|_v \in p$. In particular $\nu|_v \cap \omega|_v \neq \emptyset$. Hence $\omega$ and $\nu$ are proximal. Since $\omega$ is uniformly
recurrent, it follows that $\nu$ is also uniformly recurrent. Hence by Theorem 3.11 there exists a
minimal idempotent $q$ with $q^*(\omega) = \nu$. Hence $\omega|_u \in q$, whence $\omega|_u$ is central. □

Remark 3.13. It is readily verified that our definition of $p^*$ coincides with that of $p\text{-}\lim_n$. More
precisely, given a sequence $(x_n)_{n \in \mathbb{N}}$ in a topological space and an ultrafilter $p \in \beta\mathbb{N}$, we write
$p\text{-}\lim_n x_n = y$ if for every neighborhood $U_y$ of $y$ one has $\{n \mid x_n \in U_y\} \in p$. In our case we have
$p^*(\omega) = p\text{-}\lim_n (T^n(\omega))$ (see [24]). With this in mind, the preceding two lemmas are well known
(see for instance [10, 24]). However, our defining condition of $p^*$ in Definition 3.6 does not directly
rely on the topology and so may be applied in other general settings. For instance, let $\Omega \subseteq \mathbb{A}^{\mathbb{N}}$ be
a subshift, and $\mathcal{N} = \{n_0 < n_1 < n_2 < \cdots\}$ an infinite sequence of natural numbers. For each
$\omega \in \Omega$ and each $k \geq 1$ we put

$$X_k^{\mathcal{N}} = \{\omega_{n+n_0}\omega_{n+n_1} \cdots \omega_{n+n_{k-1}} \mid n \geq 0\} \subseteq \mathbb{A}^k.$$

For each $u \in X_k^{\mathcal{N}}$ we define the set

$$\omega^{\mathcal{N}}|_u = \{n \in \mathbb{N} \mid \omega_{n+n_0}\omega_{n+n_1} \cdots \omega_{n+n_{k-1}} = u\}.$$

Then the sets $\omega^{\mathcal{N}}|_u$ with $u \in X_k^{\mathcal{N}}$ partition $\mathbb{N}$. So, given $p \in \beta\mathbb{N}$, for each $k \geq 1$ there exists a
unique $u \in X_k^{\mathcal{N}}$ with $\omega^{\mathcal{N}}|_u \in p$. Moreover if $v \in X_{k+1}^{\mathcal{N}}$ and $\omega^{\mathcal{N}}|_v \in p$, then $u$ is a prefix of $v$.
So using the condition in Definition 3.6 each infinite sequence $\mathcal{N}$ and ultrafilter $p \in \beta\mathbb{N}$ defines
a mapping $\Omega \to \Omega$. Of particular interest is the case in which $\Omega$ is a uniform set in the sense of T.
Kamae and $\mathcal{N}$ is chosen such that $\omega[\mathcal{N}]$ is a super-stationary set (see [28, 29]).

Another situation in which the defining condition of Definition 3.6 applies is in the context of
infinite permutations [T27Tl . By an infinite permutation ir we mean a linear ordering on N. Then for 
each finite permutation $u$ of $\{1, 2, \ldots, n\}$ we say that $u$ occurs in position $m$ of $\pi$ if the restriction
of $\pi$ to $\{m, m + 1, \ldots, m + n - 1\}$ is equal to $u$. Thus we may define the set $\pi|_u$ as the set of
all $m \in \mathbb{N}$ such that $u$ occurs in position $m$ in $\pi$, and again the sets $\pi|_u$ (over all permutations $u$
of $\{1, 2, \ldots, n\}$) determine a partition of $\mathbb{N}$. Hence each $p \in \beta\mathbb{N}$ defines a map from the set of all
infinite permutations into itself.

4. A FIRST ANALYSIS OF SOME CONCRETE EXAMPLES 

4.1. The Fibonacci word. While most of the proofs of the results announced in the Introduction
rely on the algebraic and topological properties of ultrafilters on $\mathbb{N}$ and their links to IP-sets, we
begin by analyzing concretely a few examples generated by simple substitution rules. To establish
that certain subsets of $\mathbb{N}$ are IP-sets, we will use nothing more than the definition of IP-sets and
the abstract numeration systems defined by substitutions first introduced by J.-M. Dumont and A.
Thomas [17, 18].

Let us begin with the Fibonacci infinite word $\mathbf{f} = f_0 f_1 f_2 \cdots \in \{0, 1\}^{\mathbb{N}}$ given by

$$\mathbf{f} = 01001010010010100101001001010010010100101001001010010 \cdots$$

We set

$$\mathbf{f}|_0 = \{n \in \mathbb{N} \mid f_n = 0\}$$

and

$$\mathbf{f}|_1 = \{n \in \mathbb{N} \mid f_n = 1\}.$$

So $\mathbf{f}|_0 = \{0, 2, 3, 5, 7, 8, 10, 11, 13, 15, 16, \ldots\}$ and $\mathbf{f}|_1 = \{1, 4, 6, 9, 12, 14, 17, \ldots\}$. This defines
the Sturmian partition $\mathbb{N} = \mathbf{f}|_0 \cup \mathbf{f}|_1$. Let us denote by $F_n$ the $n$th Fibonacci number so that
$F_0 = 1$, $F_1 = 2$, $F_2 = 3, \ldots$. It is well known that each positive integer $n$ has one or more representations
when expressed as a sum of distinct Fibonacci numbers, i.e., $n = \sum_{i=0}^{k} t_i F_i$ with $t_i \in \{0, 1\}$ and
$t_k = 1$. We call the associated $\{0, 1\}$-word $t_k t_{k-1} \cdots t_0$ a representation of $n$. For example, for
$n = 50$ we obtain the following 6 representations (arranged in decreasing lexicographic order):

10100100 
10100011 
10011100 
10011011 

1111100 

1111011 
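
The six representations of 50 listed above can be recovered by exhaustive search. The sketch below is illustrative only: the function names are not from the paper, and the recursion simply tries the digit 1 before the digit 0 at each Fibonacci number, which yields the words in decreasing lexicographic order when they are padded to a common length.

```python
def fib_numbers(limit):
    """Fibonacci numbers F_0 = 1, F_1 = 2, F_2 = 3, ... not exceeding limit."""
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= limit:
        fibs.append(fibs[-1] + fibs[-2])
    return [f for f in fibs if f <= limit]


def representations(n):
    """All {0,1}-words t_k ... t_0 with t_k = 1 and sum of t_i * F_i equal to n."""
    fibs = fib_numbers(n)
    found = []

    def extend(index, remaining, digits):
        if index < 0:
            if remaining == 0:
                # Drop leading zeros so that the word starts with t_k = 1.
                found.append("".join(digits).lstrip("0"))
            return
        for t in (1, 0):
            if t * fibs[index] <= remaining:
                extend(index - 1, remaining - t * fibs[index], digits + [str(t)])

    extend(len(fibs) - 1, n, [])
    return found


for word in representations(50):
    print(word)
```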

The lexicographically largest representation is obtained by applying the greedy algorithm. This
gives rise to a representation of $n$ of the form $n = \sum_{i=0}^{k} t_i F_i$ with $t_{i+1} t_i \neq 11$ for each $0 \leq i \leq
k - 1$. This representation of $n$ is called the Zeckendorf representation [32] (a special case of the
Dumont-Thomas numeration system [17, 18]). We shall write $Z(n) = t_k t_{k-1} \cdots t_0$. It follows
immediately that $Z(F_n) = 10^n$. The connection between $Z(n)$ and the entry $f_n$ of the Fibonacci
word $\mathbf{f}$ is given by the following well known fact: $f_n = 0$ whenever $Z(n)$ ends in 0 and $f_n = 1$
whenever $Z(n)$ ends in 1. Thus

$$\mathbf{f}|_0 = \{n \in \mathbb{N} \mid Z(n) \text{ ends in } 0\}$$

and

$$\mathbf{f}|_1 = \{n \in \mathbb{N} \mid Z(n) \text{ ends in } 1\}.$$
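
The link between $Z(n)$ and $f_n$ lends itself to a quick numerical check. The following Python sketch assumes the Fibonacci word is generated by the substitution $0 \mapsto 01$, $1 \mapsto 0$ and uses the convention $F_0 = 1$, $F_1 = 2$ from the text; the function names are illustrative.

```python
def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def zeckendorf(n):
    """Greedy representation Z(n) of n >= 1 over F_0 = 1, F_1 = 2, F_2 = 3, ..."""
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    digits = []
    for f in reversed(fibs):
        if f <= n:
            digits.append("1")
            n -= f
        else:
            digits.append("0")
    return "".join(digits).lstrip("0")


f = fibonacci_word(500)

# f_n should equal the last digit of Z(n) for every n >= 1 in range.
last_digit_matches = all(f[n] == zeckendorf(n)[-1] for n in range(1, 500))
print(last_digit_matches)
print([n for n in range(18) if f[n] == "1"])
```

The greedy choice never produces two consecutive 1's, which is what makes the output the Zeckendorf representation rather than one of the other representations of $n$.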

We now consider the sequence $(x_n)_{n \in \mathbb{N}}$ given by $x_n = F_{2n+1}$. It is readily verified that for each
$A \in \mathrm{Fin}(\mathbb{N})$, the Zeckendorf representation of $\sum_{n \in A} x_n$ ends in $10^{2m+1}$ where $m = \min(A)$.
In fact, the symbolic sum of the individual Zeckendorf representations of each $x_n$ occurring in
$\sum_{n \in A} x_n$ does not involve any carry overs. Moreover the resulting expression does not contain
any occurrences of 11 and hence is equal to the Zeckendorf representation of $\sum_{n \in A} x_n$. Thus
every finite sum of the form $\sum_{n \in A} x_n$ with $A \in \mathrm{Fin}(\mathbb{N})$ belongs to $\mathbf{f}|_0$. Thus we have shown that
$\mathbf{f}|_0$ is an IP-set.
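
A finite portion of this claim can be tested directly. The sketch below takes the first six terms $x_n = F_{2n+1}$, forms every non-empty finite sum, and looks up the corresponding letter of the Fibonacci word; the helper names and the choice of six terms are assumptions made for illustration.

```python
from itertools import combinations


def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def fib(n):
    """Fibonacci numbers with the convention F_0 = 1, F_1 = 2."""
    a, b = 1, 2
    for _ in range(n):
        a, b = b, a + b
    return a


xs = [fib(2 * n + 1) for n in range(6)]
finite_sums = {sum(c) for size in range(1, len(xs) + 1) for c in combinations(xs, size)}

f = fibonacci_word(max(finite_sums) + 1)
all_sums_in_f0 = all(f[s] == "0" for s in finite_sums)
print(xs, all_sums_in_f0)
```

This only inspects finitely many sums, so it illustrates the statement rather than proving it.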

We next verify that $\mathbf{f}|_1$ is not an IP-set, and hence $\mathbf{f}|_0$ is an IP*-set. We will use the following
general observation. Consider a subset $A \subseteq \mathbb{N}$ partitioned into $k > 0$ non-intersecting sets:
$A = A_1 \cup A_2 \cup \cdots \cup A_k$. Suppose that for each $1 \leq j \leq k$ there exists a positive integer
$N$ (which may depend on $j$) such that whenever $m_1, m_2, \ldots, m_N$ are distinct elements of $A_j$,
we have $\sum_{i=1}^{N} m_i \notin A$. Then $A$ is not an IP-set. In fact, if $A$ were an IP-set, then for some
$1 \leq j \leq k$, there would exist a sequence $x_1 < x_2 < x_3 < \cdots$ contained in $A_j$ such that

$$\{\textstyle\sum_{n \in F} x_n \mid F \in \mathrm{Fin}(\mathbb{N})\} \subseteq A.$$

Let $\alpha = \frac{3 - \sqrt{5}}{2}$. Then the Fibonacci word $\mathbf{f}$ is the orbit of the point $\alpha$ under irrational rotation $R_\alpha$
on the unit circle by $\alpha$. Let $I$ be the interval $[1 - \alpha, 1)$ (the interval coded by 1). So $n \in \mathbf{f}|_1$ if and
only if $R_\alpha^n(\alpha) = \{\alpha + n\alpha\} = \{(n + 1)\alpha\} \in I$.
Fix

$$(1 - \alpha)/3 < \alpha' < (1 - \alpha)/2$$

and put

$$I_1 = [1 - \alpha, 1 - \alpha') \quad \text{and} \quad I_2 = [1 - \alpha', 1).$$

Since $\alpha' < (1 - \alpha)/2$ it follows that $\alpha' < \alpha$. Also for $j = 1, 2$ set

$$A_j = \{n \in \mathbb{N} \mid R_\alpha^n(\alpha) \in I_j\}.$$

Thus $A_1, A_2$ partitions the set $\mathbf{f}|_1$. We now show that $\mathbf{f}|_1$ is not an IP-set by showing that the sum
of any three elements of $A_1$ belongs to $\mathbf{f}|_0$ and that the sum of any two elements of $A_2$ belongs to
$\mathbf{f}|_0$.

Now take any $n_1, n_2, n_3 \in A_1$ and set

$$x_1 = \{(n_1 + 1)\alpha\}, \quad x_2 = \{(n_2 + 1)\alpha\}, \quad x_3 = \{(n_3 + 1)\alpha\}.$$

Then $x_1, x_2, x_3 \in [1 - \alpha, 1 - \alpha')$ and $n_1 + n_2 + n_3$ corresponds to the point

$$\{(n_1 + n_2 + n_3 + 1)\alpha\} = \{x_1 + x_2 + x_3 - 2\alpha\}.$$

Since $x_1, x_2, x_3 \in [1 - \alpha, 1 - \alpha')$, we have

$$\{x_1 + x_2 + x_3 - 2\alpha\} \in [\{3 - 5\alpha\}, \{3 - 3\alpha' - 2\alpha\}).$$

Since $\alpha' > \frac{1 - \alpha}{3}$ it follows that

$$2 - 3\alpha' - 2\alpha < 1 - \alpha,$$

and hence

$$\{2 - 3\alpha' - 2\alpha\} < 1 - \alpha,$$

which gives

$$\{3 - 3\alpha' - 2\alpha\} < 1 - \alpha$$

as required.

Similarly take any $n_1, n_2 \in A_2$. Set

$$x_1 = \{(n_1 + 1)\alpha\}, \quad x_2 = \{(n_2 + 1)\alpha\}$$

so that $x_1, x_2 \in [1 - \alpha', 1)$. Then $n_1 + n_2$ corresponds to the point

$$\{(n_1 + n_2 + 1)\alpha\} = \{x_1 + x_2 - \alpha\}.$$

Since $x_1, x_2 \in [1 - \alpha', 1)$, we have

$$\{x_1 + x_2 - \alpha\} \in [\{2 - 2\alpha' - \alpha\}, 1 - \alpha).$$

Since

$$\alpha' < \frac{1 - \alpha}{2}$$

it follows that

$$\{1 - 2\alpha' - \alpha\} > 0,$$

and hence

$$\{2 - 2\alpha' - \alpha\} > 0.$$

The above arguments may be generalized to show that $\mathbf{f}|_u$ is an IP*-set for every prefix $u$ of $\mathbf{f}$.
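
The interval argument showing that $\mathbf{f}|_1$ is not an IP-set can be explored numerically. The sketch below is an illustration under stated assumptions: it uses floating-point arithmetic for the rotation, picks the sample value $\alpha' = 0.25$ (which lies in the required range), and only examines indices below a fixed bound. The names `orbit_point`, `in_f1`, `A1` and `A2` are illustrative.

```python
import math
from itertools import combinations

ALPHA = (3 - math.sqrt(5)) / 2
ALPHA_PRIME = 0.25  # sample value in ((1 - ALPHA) / 3, (1 - ALPHA) / 2)


def orbit_point(n):
    """R_alpha^n(alpha) = {(n + 1) * alpha}, computed in floating point."""
    return ((n + 1) * ALPHA) % 1.0


def in_f1(n):
    """True when n is coded by 1, i.e. the orbit point lies in [1 - alpha, 1)."""
    return orbit_point(n) >= 1 - ALPHA


LIMIT = 80
A1 = [n for n in range(LIMIT) if 1 - ALPHA <= orbit_point(n) < 1 - ALPHA_PRIME]
A2 = [n for n in range(LIMIT) if orbit_point(n) >= 1 - ALPHA_PRIME]

# Sums of three elements of A1 and of two elements of A2 should avoid f|_1.
triple_sums_avoid_f1 = all(not in_f1(a + b + c) for a, b, c in combinations(A1, 3))
pair_sums_avoid_f1 = all(not in_f1(a + b) for a, b in combinations(A2, 2))
print(triple_sums_avoid_f1, pair_sums_avoid_f1)
```

For the small indices used here the orbit points stay well away from the interval endpoints, so rounding error should not affect the classification; an exact integer implementation would be preferable for large indices.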

In contrast, let us consider the sets $\mathbf{g}|_0$ and $\mathbf{g}|_1$ where $\mathbf{g} = 0\mathbf{f} = 001001010010010 \cdots$. Thus,

$$\mathbf{g}|_0 = \{n \in \mathbb{N} \mid g_n = 0\} = \{0\} \cup \{n \geq 1 \mid f_{n-1} = 0\}.$$

Consider the sequence $(y_n)_{n \in \mathbb{N}}$ defined by $y_n = F_{2n+2}$. It is readily verified that $Z(y_n - 1) =
(10)^{n+1}$ and hence each $y_n$ belongs to $\mathbf{g}|_0$. Now fix $A \in \mathrm{Fin}(\mathbb{N})$. Since the Zeckendorf representation
of $\sum_{n \in A} y_n$ ends in $10^{2m+2}$ where $m = \min(A)$, it follows that $Z(\sum_{n \in A} y_n - 1)$ ends in
$(10)^{m+1}$, and hence $\sum_{n \in A} y_n \in \mathbf{g}|_0$. Thus, $\mathbf{g}|_0$ is an IP-set. Similarly, it is readily verified that
for each $A \in \mathrm{Fin}(\mathbb{N})$, we have that $\sum_{n \in A} x_n \in \mathbf{g}|_1$ where $x_n = F_{2n+1}$. Thus this time we obtain
the Sturmian decomposition $\mathbb{N} = \mathbf{g}|_0 \cup \mathbf{g}|_1$ in which both sets $\mathbf{g}|_0$ and $\mathbf{g}|_1$ are IP-sets, and hence
central sets. In this case, neither $\mathbf{g}|_0$ nor $\mathbf{g}|_1$ is an IP*-set. Once again, these arguments may be
extended to show that both $\mathbf{g}|_{0u}$ and $\mathbf{g}|_{1u}$ are central sets for any prefix $u$ of $\mathbf{f}$ and hence neither set
is an IP*-set.
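
Both halves of this decomposition can be spot-checked on finitely many sums. In the Python sketch below, $\mathbf{g}$ is built by prepending 0 to a prefix of the Fibonacci word, and the first five terms of each sequence are used; the helper names and the cut-off of five terms are illustrative assumptions.

```python
from itertools import combinations


def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def fib(n):
    """Fibonacci numbers with the convention F_0 = 1, F_1 = 2."""
    a, b = 1, 2
    for _ in range(n):
        a, b = b, a + b
    return a


def finite_sums(terms):
    """All sums over non-empty subsets of the given terms."""
    return {sum(c) for size in range(1, len(terms) + 1) for c in combinations(terms, size)}


ys = [fib(2 * n + 2) for n in range(5)]  # y_n = F_{2n+2}
xs = [fib(2 * n + 1) for n in range(5)]  # x_n = F_{2n+1}

g = "0" + fibonacci_word(sum(ys))  # g = 0f, long enough for every sum below

ys_sums_in_g0 = all(g[s] == "0" for s in finite_sums(ys))
xs_sums_in_g1 = all(g[s] == "1" for s in finite_sums(xs))
print(ys_sums_in_g0, xs_sums_in_g1)
```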

In summary, by Theorem 3.12 we have:

Proposition 4.1. Let $\mathbf{f}$ denote the Fibonacci word. Then for every prefix $u$ of $\mathbf{f}$ the set $\mathbf{f}|_u$ is an
IP*-set (and hence a central* set). Setting $\mathbf{g} = 0\mathbf{f}$ we have that for every prefix $u$ of $\mathbf{f}$ the sets $\mathbf{g}|_{0u}$
and $\mathbf{g}|_{1u}$ are both IP-sets (resp. central sets).

4.2. The m-bonacci word. The above analysis extends more generally to the so-called $m$-bonacci
word. Fix a positive integer $m \geq 2$, and let $\mathbf{t} = t_0 t_1 t_2 \cdots \in \{0, 1, \ldots, m - 1\}^{\mathbb{N}}$ denote the
$m$-bonacci infinite word fixed by the substitution

$$\sigma_m : \{0, 1, \ldots, m - 1\} \to \{0, 1, \ldots, m - 1\}^*$$

given by

$$\sigma_m(i) = \begin{cases} 0(i + 1) & \text{for } 0 \leq i < m - 1 \\ 0 & \text{for } i = m - 1 \end{cases}$$

Using the associated Dumont-Thomas numeration system, we will show:

Proposition 4.2. Let $m \geq 2$, and consider the partition of $\mathbb{N}$ given by

$$\mathbb{N} = \bigcup_{0 \leq k \leq m-1} \mathbf{g}|_k$$

where $\mathbf{g} = 0\mathbf{t} \in \{0, 1, \ldots, m - 1\}^{\mathbb{N}}$. Then for each $0 \leq k \leq m - 1$ the set $\mathbf{g}|_k$ is an IP-set (resp.
central set).

The proof is a simple extension of the ideas outlined above in the case of the Fibonacci word.
For each $m \geq 2$, we define the $m$-bonacci numbers by $T_k = 2^k$ for $0 \leq k \leq m - 1$ and $T_k =
T_{k-1} + T_{k-2} + \cdots + T_{k-m}$ for $k \geq m$. When $m = 2$, these are the usual Fibonacci numbers.
Each positive integer $n$ may be written in one or more ways in the form $n = \sum_{i=1}^{k} t_i T_{k-i}$ where
$t_i \in \{0, 1\}$ and $t_1 = 1$. By applying the greedy algorithm, one obtains a representation of $n$
of the form $w = t_1 t_2 \cdots t_k$ with the property that $w$ does not contain $m$ consecutive 1's. Such
a representation of $n$ is necessarily unique and is called the $m$-Zeckendorf representation of $n$,
denoted $Z_m(n)$ (see ED). Thus $Z_m(T_n) = 10^n$ for $n \geq 0$.

Proof. Fix $0 \leq k \leq m - 1$. We will show that the set $\mathbf{g}|_k$ is an IP-set. It is well known that $t_n = k$
if and only if $Z_m(n)$ ends in $01^k$. Hence

$$\mathbf{g}|_k = \{n \in \mathbb{N} \mid g_n = k\} = \{n \in \mathbb{N} \mid t_{n-1} = k\} = \{n \in \mathbb{N} \mid Z_m(n - 1) \text{ ends in } 01^k\}.$$

Consider the sequence $(x_n)_{n \in \mathbb{N}}$ given by $x_n = T_{mn+k}$. It is readily verified that for any finite subset
$A \subseteq \mathbb{N}$, the $m$-Zeckendorf representation of the finite sum $s = \sum_{n \in A} x_n$ ends in $10^{mr+k}$ where
$r = \min(A)$ and hence the $m$-Zeckendorf representation of $s - 1$ ends in $(1^{m-1}0)^r 1^k$ and hence
$s \in \mathbf{g}|_k$ as required.

Having established that each of the sets $\mathbf{g}|_k$ is a central set (for $0 \leq k \leq m - 1$), it follows that no
$\mathbf{g}|_k$ is an IP*-set.

□
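
The construction in the proof can be tried out for small $m$. The following Python sketch generates a prefix of the $m$-bonacci word by iterating the substitution, computes the $m$-bonacci numbers, and checks that finite sums of the first few $x_n = T_{mn+k}$ fall in $\mathbf{g}|_k$. The function names and the default of three terms are assumptions made for illustration.

```python
from itertools import combinations


def mbonacci_word(m, length):
    """Prefix of the fixed point of i -> 0(i+1) for i < m-1 and (m-1) -> 0."""
    word = [0]
    while len(word) < length:
        image = []
        for letter in word:
            image.extend([0, letter + 1] if letter < m - 1 else [0])
        word = image
    return word[:length]


def mbonacci_numbers(m, count):
    """T_k = 2^k for k < m, then each term is the sum of the previous m terms."""
    numbers = [2 ** k for k in range(m)]
    while len(numbers) < count:
        numbers.append(sum(numbers[-m:]))
    return numbers[:count]


def sums_land_in_class(m, k, terms=3):
    """Check that all finite sums of x_n = T_{mn+k}, n < terms, lie in g|_k."""
    T = mbonacci_numbers(m, m * terms + k)
    xs = [T[m * n + k] for n in range(terms)]
    g = [0] + mbonacci_word(m, sum(xs))  # g = 0t, so g[s] = t[s - 1]
    sums = [sum(c) for size in range(1, terms + 1) for c in combinations(xs, size)]
    return all(g[s] == k for s in sums)


for m in (2, 3, 4):
    print(m, [sums_land_in_class(m, k) for k in range(m)])
```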

As an immediate consequence of Proposition 4.2 we have:

Corollary 4.3. For each positive integer $r$ there exists a partition $\mathbb{N} = A_1 \cup A_2 \cup \cdots \cup A_r$ in which
each $A_i$ is a central set.

Proof. Taking $m = r$ in Proposition 4.2, for each $1 \leq k \leq r$ it suffices to take $A_k = \mathbf{g}|_{k-1}$. □

5. STURMIAN PARTITIONS & CENTRAL SETS 

In this section we prove the results announced in section 1 concerning Sturmian partitions of $\mathbb{N}$.
Throughout this section $\omega = \omega_0 \omega_1 \omega_2 \cdots \in \{0, 1\}^{\mathbb{N}}$ will denote a Sturmian word, $\mathcal{F}$ the set of all
factors of $\omega$, and $(\Omega, T)$ the subshift generated by $\omega$, where $T$ denotes the shift map. We denote by
$\tilde{\omega} \in \Omega$ the characteristic word.

Lemma 5.1. If $\omega, \omega', \omega'' \in \Omega$ are such that $T^{n_0}(\omega) = T^{n_0}(\omega') = T^{n_0}(\omega'')$, then $\mathrm{Card}\{\omega, \omega', \omega''\} \leq 2$.

Proof. This follows immediately from the fact that $\Omega$ contains a unique characteristic word and
that this word is aperiodic. □

We will make use of the following key lemma which essentially says that two distinct Sturmian
words $\omega$ and $\omega'$ are proximal if and only if $T^n(\omega) = T^n(\omega') = \tilde{\omega}$ for some $n \geq 1$.

Lemma 5.2. Let $\omega$ and $\omega'$ be distinct elements of $\Omega$. Then either $T^n(\omega) = T^n(\omega') = \tilde{\omega}$ for some
$n \geq 1$, or there exists $N > 0$ such that $\omega_n \omega_{n+1} \cdots \omega_{n+N} \neq \omega'_n \omega'_{n+1} \cdots \omega'_{n+N}$ for every $n \in \mathbb{N}$.

Proof. We will use a definition of Sturmian words via rotations, which we recalled in Section 2.
Notice that $\tilde{\omega} = s_{\alpha,\alpha} = s'_{\alpha,\alpha}$, and singular words correspond to the case when the orbit of a point
under the rotation map goes through the point $\alpha$. If $s_{\alpha,\rho}$ is non-singular, then $s_{\alpha,\rho} = s'_{\alpha,\rho}$. If $\omega \neq \omega'$ are
singular words defined by rotations of the same point, i.e., $\omega = s_{\alpha,\rho}$, $\omega' = s'_{\alpha,\rho}$, then they differ
only when they pass through $1 - \alpha$ and 0, i.e., in at most two positions, so there exists $n_0 \geq 1$
such that $T^{n_0}(\omega) = T^{n_0}(\omega') = \tilde{\omega}$.

Now consider the case when $\omega, \omega'$ are defined by rotations of two different points $\rho, \rho'$, $0 \leq \rho <
\rho' < 1$. To be definite, let us consider the interval exchange of $I_0$ and $I_1$ for both $\omega$ and $\omega'$. We
should prove that there exists $N > 0$ such that

$$\omega_n \omega_{n+1} \cdots \omega_{n+N} \neq \omega'_n \omega'_{n+1} \cdots \omega'_{n+N}$$

for every $n \in \mathbb{N}$. Identifying $\omega_i$ and $\omega'_i$ with the $i$th points of the orbits of $\rho$ and $\rho'$, we have $\omega_i \neq \omega'_i$
if and only if $\omega_i \in I_0$, $\omega'_i \in I_1$ or $\omega_i \in I_1$, $\omega'_i \in I_0$. This
condition is equivalent to

$$\omega_i \in [1 - \alpha - (\rho' - \rho), 1 - \alpha) \cup [1 - (\rho' - \rho), 1).$$

The orbit of any point $\theta$ under rotation by $\alpha$ is dense; this means
that for every $\varepsilon$ there exists $N(\varepsilon)$ such that after $N(\varepsilon)$ iterations the points split the interval $[0, 1)$ into
intervals of length less than $\varepsilon$. Putting $\varepsilon = \rho' - \rho$, we get that among every $N = N(\varepsilon)$ consecutive iterations
there will be a point in every interval of length $\rho' - \rho$, so there are points in $[1 - \alpha - (\rho' - \rho), 1 - \alpha)$
and $[1 - (\rho' - \rho), 1)$ every $N$ iterations, and hence for every $n$ there exists $i \in [n, n + N - 1]$ with
$\omega_i \neq \omega'_i$.

□

We first consider the case of nonsingular Sturmian words:

Lemma 5.3. Let $\omega \in \{0, 1\}^{\mathbb{N}}$ be a nonsingular Sturmian word and $p \in \beta\mathbb{N}$ an idempotent ultrafilter.
Then $p^*(\omega) = \omega$.

Proof. Suppose to the contrary that $p^*(\omega) \neq \omega$. Then since $\omega$ is nonsingular, Lemma 5.2 implies
that for all sufficiently long factors $u$ of $\omega$, we have that $\omega|_u \cap p^*(\omega)|_u = \emptyset$. But, by Lemma 3.9
we have $p^*(p^*(\omega)) = p^*(\omega)$, that is, the images under $p^*$ of $\omega$ and $p^*(\omega)$ coincide. It follows by
definition of $p^*$ that for every prefix $u$ of $p^*(\omega)$ we have $\omega|_u \in p$ and $p^*(\omega)|_u \in p$ and hence
$\omega|_u \cap p^*(\omega)|_u \in p$, a contradiction. □

Theorem 5.4. Let $\omega \in \Omega$ be a nonsingular Sturmian word, and $u$ a factor of $\omega$. Then $\omega|_u$ is an
IP-set (resp. central set) if and only if $u$ is a prefix of $\omega$. Hence for every prefix $v$ of $\omega$ and $n \in \omega|_v$
the set $\omega|_v - n$ is an IP*-set (resp. central* set).

Proof. Let $\omega$ be a nonsingular Sturmian word, $u$ a prefix of $\omega$, and $p \in \beta\mathbb{N}$ an idempotent ultrafilter.
Then by Lemma 5.3 $u$ is a prefix of $p^*(\omega)$ and hence $\omega|_u \in p$. Thus for each prefix $u$ of $\omega$ the set
$\omega|_u$ belongs to every idempotent ultrafilter and hence is an IP*-set. It follows that if $v \in \mathcal{F}$ is
not a prefix of $\omega$, then $\omega|_v$ is not an IP-set. Finally, let $v$ be any factor of $\omega$ and $n \in \mathbb{N}$. Then
$\omega|_v - n = T^n(\omega)|_v$. If $n \in \omega|_v$, then $v$ is a prefix of $T^n(\omega)$ from which it follows that

$$\omega|_v - n = T^n(\omega)|_v \in p.$$

Hence $\omega|_v - n$ is an IP*-set. □

As a consequence of the above theorem we have 

Corollary 5.5. Let $\omega$ and $\omega'$ be two nonsingular Sturmian words, not necessarily of the same slope.
Then for every prefix $u$ of $\omega$ and every prefix $u'$ of $\omega'$ we have that $\omega|_u \cap \omega'|_{u'}$ is an IP*-set (resp.
central* set); in particular the intersection is infinite.

We note that the assumption that $\omega$ and $\omega'$ be nonsingular is necessary, as for example if we
consider $\omega = 0\mathbf{f}$ and $\omega' = 1\mathbf{f}$ with $\mathbf{f}$ the Fibonacci word, then $\omega|_0 \cap \omega'|_1 = \{0\}$.

Proof. Let $\omega$ and $\omega'$ be two nonsingular Sturmian words, $u$ a prefix of $\omega$, $u'$ a prefix of $\omega'$, and
$p \in \beta\mathbb{N}$ an idempotent ultrafilter. Then by Corollary Q] we have that $\omega|_u \in p$ and $\omega'|_{u'} \in p$ and
hence $\omega|_u \cap \omega'|_{u'} \in p$. Thus $\omega|_u \cap \omega'|_{u'}$ belongs to every idempotent and hence is an IP*-set. □

We next consider singular Sturmian words. 
Lemma 5.6. Let $\omega, \omega' \in \Omega$ be distinct Sturmian words such that $T^{n_0}(\omega) = T^{n_0}(\omega') = \tilde{\omega}$ for some
$n_0 \geq 1$. Then for every $u \in \mathcal{F}$ and every non-principal ultrafilter $p \in \beta\mathbb{N}$ we have

$$\omega|_u \in p \iff \omega'|_u \in p.$$

In particular, $p^*(\omega) = p^*(\omega')$.

Proof. Since $p$ is a non-principal ultrafilter, we have that $\omega|_u \in p \iff \omega|_u \cap [N, +\infty) \in p$ for all
$N \geq 1$. Similarly $\omega'|_u \in p \iff \omega'|_u \cap [N, +\infty) \in p$ for all $N \geq 1$. But for each $u \in \mathcal{F}$, we have
$\omega|_u \cap [n_0, +\infty) = \omega'|_u \cap [n_0, +\infty)$. The result now follows. □

Lemma 5.7. Let $\omega, \omega' \in \Omega$ be as in the previous lemma, and let $p \in \beta\mathbb{N}$ be an idempotent
ultrafilter. Then $p^*(\omega) = p^*(\omega') \in \{\omega, \omega'\}$.

Proof. That $p^*(\omega) = p^*(\omega')$ follows from the previous lemma and the fact that idempotent ultrafilters
are non-principal (see for instance [10]). By Lemma 3.10 $p^*$ commutes with the shift map $T$,
and hence

$$T^{n_0} p^*(\omega) = p^*(T^{n_0}\omega) = p^*(\tilde{\omega}) = \tilde{\omega}$$

where the last equality follows from Lemma 5.3. By Lemma 5.1 applied to $\omega'' = p^*(\omega)$ it follows
that $p^*(\omega) = \omega$ or $p^*(\omega) = \omega'$. □

Theorem 5.8. Let $\omega \in \Omega$ be a Sturmian word such that $T^{n_0}(\omega) = \tilde{\omega}$ with $n_0 \geq 1$. Then $\omega|_u$ is an
IP-set (or central set) if and only if either $u$ is a prefix of $\omega$ or a prefix of $\omega'$ where $\omega'$ is the unique
other element of $\Omega$ with $T^{n_0}(\omega') = \tilde{\omega}$.

Proof. Let $\omega \in \Omega$ and $n_0$ be as in the statement of the theorem. Then there exists a unique $\omega' \in \Omega$
with $\omega' \neq \omega$ and with $T^{n_0}(\omega') = \tilde{\omega}$. Suppose that $\omega|_u$ is an IP-set for some $u \in \mathcal{F}$. Then by
Lemma 3.7 it follows that $u$ is a prefix of $p^*(\omega)$ for some idempotent ultrafilter $p \in \beta\mathbb{N}$. It follows
from Lemma 5.7 that $u$ is a prefix of $\omega$ or a prefix of $\omega'$. This proves one direction.

To establish the other direction, we must show that $\omega|_u$ is a central set for each prefix $u$ of $\omega$ or of
$\omega'$. By Theorem 3.11 there exist minimal idempotent ultrafilters $p_1, p_2 \in \beta\mathbb{N}$ such that $p_1^*(\omega) = \omega$
and $p_2^*(\omega) = \omega'$. The result now follows. □

Remark 5.9. V. Bergelson [9] suggested to us that the above result may be related to a previously
known partition of $\mathbb{N}$ into two central sets $X = \{[mx], m \in \mathbb{N}\}$ and $Y = \{[my], m \in \mathbb{N}\}$, where
$x$ and $y$ are two irrational numbers satisfying $1/x + 1/y = 1$. In fact, this partition precisely
corresponds to our partition of $\mathbb{N}$ into two IP-sets $\omega|_0$ and $\omega|_1$ where $\omega$ is of the form $0\tilde{\omega}$ and $\tilde{\omega}$ is
a characteristic Sturmian word.

This could be seen using the definition of Sturmian words via mechanical words (see Section 2
for notation). For a slope $\alpha$ we have $s_{\alpha,0} = 0\tilde{\omega}$. Let $\alpha = 1/x$ and $1/y = 1 - \alpha$; then $s_{\alpha,0}(n) = 1$ if
and only if there exists an integer $k$ such that $\alpha(n + 1) > k$ and $\alpha n < k$. It is easy to see that this
pair of inequalities is equivalent to $n < kx < n + 1$, which implies $n \in X$. We have $s_{\alpha,0}(n) = 0$ if
and only if there exists an integer $k$ such that $\alpha(n + 1) < k + 1$ and $\alpha n \geq k$. It is not difficult to
see that this pair of inequalities is equivalent to $n \leq (n - k)y < n + 1$, which implies $n \in Y$.
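
The correspondence described in Remark 5.9 can be illustrated numerically for one sample slope. The sketch below makes several assumptions that are not in the original text: it takes $\alpha = (3 - \sqrt{5})/2$, it defines the mechanical word by $s_{\alpha,0}(n) = \lfloor \alpha(n+1) \rfloor - \lfloor \alpha n \rfloor$ (Section 2 is not reproduced here), it uses floating-point arithmetic, and it lets $m$ start at 1 for $X$ and at 0 for $Y$ so that 0 is counted only once.

```python
import math

ALPHA = (3 - math.sqrt(5)) / 2  # sample slope, chosen for illustration
X_RATIO = 1 / ALPHA             # x, so that alpha = 1/x
Y_RATIO = 1 / (1 - ALPHA)       # y, so that 1/x + 1/y = 1


def mechanical_letter(n):
    """Assumed definition of s_{alpha,0}(n) as a difference of floors."""
    return math.floor(ALPHA * (n + 1)) - math.floor(ALPHA * n)


LIMIT = 200
letters = [mechanical_letter(n) for n in range(LIMIT)]

# Beatty-type sets, truncated to the range under inspection.
X = {math.floor(m * X_RATIO) for m in range(1, LIMIT)}
Y = {math.floor(m * Y_RATIO) for m in range(0, LIMIT)}

ones = {n for n in range(LIMIT) if letters[n] == 1}
zeros = {n for n in range(LIMIT) if letters[n] == 0}

print(ones == {n for n in X if n < LIMIT})
print(zeros == {n for n in Y if n < LIMIT})
```

For this particular slope the resulting word is the word $\mathbf{g} = 0\mathbf{f}$ of Section 4.1.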

Remark 5.10. We are unable to extend the results on Sturmian partitions to all Arnoux-Rauzy
words. In fact, our proof of Lemma 5.2 relies on the geometric interpretation of Sturmian words
as codings of orbits under an irrational rotation on the circle. It was shown in [12] that there exist
Arnoux-Rauzy words which are not measure-theoretically conjugate to a rotation on the $n$-torus.
In this case, we do not understand which pairs of Arnoux-Rauzy words in the subshift are
proximal.

6. Other central partitions defined by substitutions

We begin by briefly reviewing some notions from topological dynamics in the framework of
minimal subshifts $(X, T)$ which will be used in the proof of Theorem 4. For this we consider
two-sided subshifts $(X, T)$ meaning that $X \subseteq \mathbb{A}^{\mathbb{Z}}$. So points in $X$ are bi-infinite words. A subshift
$(X, T)$ is said to be equicontinuous if for every $\varepsilon > 0$, there exists a $\delta > 0$, such that for all
$x, y \in X$, if $d(x, y) < \delta$ then $d(T^n(x), T^n(y)) < \varepsilon$ for every $n \in \mathbb{Z}$. A subshift $(Y, T)$ is called a
factor of $(X, T)$ if there exists a continuous surjection

$$\pi : X \to Y$$

which commutes with the shift map $T$. It is well known (for instance by way of Zorn's lemma) that
every subshift $(X, T)$ has a maximal equicontinuous factor $(Y, T)$, i.e., $(Y, T)$ is an equicontinuous
factor of $(X, T)$ and any equicontinuous factor $(Z, T)$ of $(X, T)$ is also a factor of $(Y, T)$. It is
also well known that if $\pi : X \to Y$ is the maximal equicontinuous factor, then for any two points
$x, y \in X$ we have that $\pi(x) = \pi(y)$ if and only if $x$ and $y$ are regionally proximal (see ).

Proof of Theorem 4. Let us fix positive integers $r$ and $N$. Consider the constant length substitution

$$\tau : \{1, 2, \ldots, r\} \to \{1, 2, \ldots, r\}^+$$

given by $1 \mapsto 123 \cdots r$, $2 \mapsto 23 \cdots r1$, $3 \mapsto 34 \cdots r12$, $\ldots$, $r \mapsto r12 \cdots (r - 1)$. In case $r = 2$
we have the Thue-Morse substitution on the alphabet $\{1, 2\}$. For $1 \leq i \leq r$, let $x^{(i)}$ denote the $i$th
fixed point of $\tau$ beginning in the letter $i$. As in the case of Thue-Morse, for $i \neq j$ the words $x^{(i)}$
and $x^{(j)}$ never coincide, i.e., $x^{(i)}_n \neq x^{(j)}_n$ for each $n \in \mathbb{N}$. Let $(X, T)$ denote the one-sided minimal
subshift generated by the primitive substitution $\tau$. We will now show that each of the fixed points
$x^{(i)}$ is distal.
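
The statement that distinct fixed points of $\tau$ never coincide can be checked on finite prefixes. The Python sketch below is illustrative: the function names are not from the paper, letters are represented as the integers $1, \ldots, r$, and $r \geq 2$ is assumed so that iterating $\tau$ actually lengthens the word.

```python
def tau(word, r):
    """Apply tau: the letter i is sent to i, i+1, ..., r, 1, ..., i-1."""
    image = []
    for letter in word:
        image.extend((letter - 1 + shift) % r + 1 for shift in range(r))
    return image


def fixed_point_prefix(i, r, length):
    """Prefix of the fixed point of tau beginning in the letter i (r >= 2)."""
    if r < 2:
        raise ValueError("r must be at least 2")
    word = [i]
    while len(word) < length:
        word = tau(word, r)
    return word[:length]


def never_coincide(r, length):
    """Check x^(i)_n != x^(j)_n for all i != j and all n < length."""
    points = [fixed_point_prefix(i, r, length) for i in range(1, r + 1)]
    return all(
        len({point[n] for point in points}) == r for n in range(length)
    )


print(fixed_point_prefix(1, 2, 8))  # Thue-Morse word on the alphabet {1, 2}
print(never_coincide(3, 81))
```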

Lemma 6.1. Let x denote any one of the fixed points a?W of the substitution r above. Then x is 
distal. In particular, the two fixed points of the Thue-Morse substitution are each distal. 

Proof. Let $(X, T)$ denote the two-sided subshift generated by $\tau$, and let $\pi : X \to Y$ denote the
maximal equicontinuous factor. The substitution $\tau$ above is of Pisot type; in fact, the dilation of $\tau$
is $r$ and all other eigenvalues are equal to 0. (Note that $\tau$ is not an irreducible substitution.) In [3],
V. Baker, M. Barge and J. Kwapisz show that for a primitive substitution of Pisot type (irreducible
or not), the mapping onto the maximal equicontinuous factor is finite to one. Thus there exists
a constant $C$ such that for any $z \in X$, there are at most $C$ points $z' \in X$ which are regionally
proximal to $z$. In particular, for any $z \in X$, there are at most $C$ points $z' \in X$ which are proximal
to $z$.

Now suppose $y \in X$ is proximal to $x$. We will show that $y = x$. It is easy to see that the
bi-infinite word $z = x_{\mathrm{rev}} \cdot x$ belongs to $X$, where $x_{\mathrm{rev}}$ denotes the reversal or mirror image of $x$, and
where $\cdot$ denotes the origin. Similarly, let $y'$ denote a left infinite word such that the concatenation
$z' = y' \cdot y \in X$. Since $x$ and $y$ are proximal, it follows that $z$ and $z'$ are proximal. Set $\sigma = \tau^r$.
Since $\tau$, and hence $\sigma$, is of constant length, it follows that $\sigma(z')$ is proximal to $\sigma(z)$. But $\sigma(z) = z$.
Hence $(\sigma^n(z'))_{n \ge 0}$ defines an infinite sequence of points in $X$ each of which is proximal to $z$, and
which in the limit tends to $x^{(i)}_{\mathrm{rev}} \cdot x^{(j)}$ where $i$ is the first (meaning rightmost) letter of $y'$ and $j$ is the
first letter of $y$. But since there are only finitely many points in $X$ which are proximal to $z$ it follows
that $\sigma^n(z') = x^{(i)}_{\mathrm{rev}} \cdot x^{(j)}$ for some $n \ge 0$. Hence by de-substituting we obtain $z' = x^{(i)}_{\mathrm{rev}} \cdot x^{(j)}$ from
which it follows that $y = x^{(j)}$. Thus both $x$ and $y$ are fixed points of $\tau$ which are assumed proximal.
Since distinct fixed points of $\tau$ never coincide, they cannot be proximal. It follows that $y = x$ and
hence $x$ is distal as required. □

Put $x = x^{(1)}$. Since $x$ is distal, so is $T^n(x)$ for each $n \ge 1$. On the other hand, it is easy
to see that for each positive integer $n$ we have $u^{(i)}[n]x \in X$, where $u^{(i)}[n]$ denotes the reversal
of the prefix of $x^{(i)}$ of length $n$. Thus the $r$ words $\{u^{(1)}[n]x, u^{(2)}[n]x, \ldots, u^{(r)}[n]x\}$ are pairwise
proximal and begin in distinct letters (this is because the fixed points never coincide). Finally
let $\omega = u^{(1)}[N+1]x$, and set $A_i = \omega|_i$ for each $1 \le i \le r$. Then each $A_i$ is a central set. For each
$1 \le n \le N$, we have that $A_i - n = T^n(\omega)|_i = u^{(1)}[N+1-n]x|_i$ is a central set. But for $k \ge 1$,
we have that $A_i - (N+k) = T^{k-1}(x)|_i$, which is a central set if and only if $T^{k-1}(x)$ begins in $i$.

□ 

Proof of Theorem 5. Fix a positive integer $r$. Let $\tau$ be a primitive substitution whose associated
subshift $\Omega$ is topologically weak mixing. For instance we may take the substitution $0 \mapsto 001$ and
$1 \mapsto 11001$, or $0 \mapsto 001$ and $1 \mapsto 11100$ (see ). Let $\omega \in \Omega$. Fix $m$ such that $p_\omega(m) \ge r$,
and put $s = p_\omega(m)$. Let $u_1, u_2, \ldots, u_s$ denote the factors of $\omega$ of length $m$. As pointed out to us
by V. Bergelson and Y. Son [9], the weak mixing implies that the set of points in $\Omega$ proximal to
$\omega$ is dense in $\Omega$ (see for instance page 184 of [22]). Thus for each factor $u_i$ there exists a word
$x_i \in \Omega$ beginning in $u_i$ and which is proximal to $\omega$. Hence by Theorem 3.11 there exists a minimal
idempotent ultrafilter $p_i \in \beta\mathbb{N}$ such that $p_i^*(\omega) = x_i$. Hence for each $1 \le i \le s$ we have that
$\omega|_{u_i} \in p_i$ and hence $\omega|_{u_i}$ is a central set. Finally, for each positive integer $n$ and for each $1 \le i \le s$
we have that

$\omega|_{u_i} - n = T^n(\omega)|_{u_i}$.

Again the weak mixing implies that there exists a word $x \in \Omega$ beginning in $u_i$ and proximal to
$T^n(\omega)$. Hence there exists a minimal idempotent $p \in \beta\mathbb{N}$ such that $p^*(T^n(\omega)) = x$, from which it
follows that $\omega|_{u_i} - n \in p$ and hence $\omega|_{u_i} - n$ is a central set. Thus we obtain a partition of $\mathbb{N}$



$\mathbb{N} = \bigcup_{i=1}^{s} \omega|_{u_i}$

into $s$-many central sets and for each positive integer $n$ and $1 \le i \le s$ we have that $\omega|_{u_i} - n$ is
again a central set. Thus, setting

$A_i = \omega|_{u_i}$

for $i = 1, \ldots, r-1$, and

$A_r = \bigcup_{i=r}^{s} \omega|_{u_i}$

we obtain the desired partition of $\mathbb{N}$. □

7. Infinite central partitions of N 

In this section we construct infinite partitions of $\mathbb{N}$ into central sets by using words on an infinite
alphabet and prove Theorem 6. Our construction makes use of the iterated palindromic
closure operator (first introduced in [16]):

Definition 7.1. The iterated palindromic closure operator $\psi$ is defined inductively as follows:

• $\psi(\varepsilon) = \varepsilon$,

• For any word $w$ and any letter $a$, $\psi(wa) = (\psi(w)a)^{(+)}$.

We denote by $w^{(+)}$ the right palindromic closure of the word $w$, i.e., the shortest palindrome
which has $w$ as a prefix.

For example, $\psi(aaba) = aabaaabaa$. The operator $\psi$ has been extensively studied for its central
role in constructing standard Sturmian and episturmian words. It follows immediately from the
definition that if $u$ is a prefix of $v$, then $\psi(u)$ is a prefix of $\psi(v)$. Thus, given an infinite word
$\omega = \omega_0\omega_1\omega_2\cdots$ on the alphabet $\mathcal{A}$ we can define

$\psi(\omega) = \lim_{n \to \infty} \psi(\omega_0\omega_1\omega_2\cdots\omega_n)$.
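The inductive definition of the iterated palindromic closure translates directly into a short program. The sketch below is only an illustration under the assumption that finite words are represented as Python strings; the function names are hypothetical. It computes the right palindromic closure by locating the longest palindromic suffix and then applies the rule of Definition 7.1 letter by letter.

```python
def right_palindromic_closure(w: str) -> str:
    """Shortest palindrome having w as a prefix (the right palindromic closure)."""
    for i in range(len(w) + 1):
        suffix = w[i:]
        if suffix == suffix[::-1]:
            # w[i:] is the longest palindromic suffix of w;
            # append the mirror image of the part that precedes it.
            return w + w[:i][::-1]
    return w  # not reached: the empty suffix is always a palindrome


def iterated_palindromic_closure(w: str) -> str:
    """Iterated palindromic closure of w, built one letter at a time."""
    result = ""
    for letter in w:
        result = right_palindromic_closure(result + letter)
    return result


print(iterated_palindromic_closure("aaba"))  # aabaaabaa
```

Applying `iterated_palindromic_closure` to longer and longer prefixes of an infinite word produces the nested palindromic prefixes whose limit is taken in the definition above.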

The following lemma summarizes the properties of $\psi$ needed.

Lemma 7.2. Let $\Delta$ be a right infinite word over the (finite or infinite) alphabet $\mathcal{A}$ and let $\omega = \psi(\Delta)$.
Then the following statements hold:

(1) The word $\omega$ is closed under reversal, i.e., if $v = v_1v_2\cdots v_k$ is a factor of $\omega$, then so is its
mirror image $v_k \cdots v_2v_1$.

(2) The word $\omega$ is uniformly recurrent.

(3) If each letter $a \in \mathcal{A}$ appears in $\Delta$ an infinite number of times, then for each prefix $u$ of $\omega$
and each $a \in \mathcal{A}$, we have that $au$ is a factor of $\omega$.

Proof. Since any factor of $\omega$ is contained in some $\psi(v)$ for a sufficiently long prefix $v$ of $\Delta$, and
$\psi(v)$ is by definition a palindrome (and hence closed under reversal), the first statement is proved.
The second statement is easily derived from the fact that for any finite prefix $va$ of $\Delta$ ($a$ being a
letter), we have that $|\psi(va)| \le 2|\psi(v)| + 1$ and moreover $\psi(va)$ begins and ends in $\psi(v)$. It follows
that any factor of length (for example) $3|\psi(v)|$ contains an occurrence of $\psi(v)$.

Finally suppose each $a \in \mathcal{A}$ appears infinitely many times in $\Delta$. Thus for any letter $a$ and any
prefix $v$ of $\Delta$ there exists a prefix of $\Delta$ of the form $vv'a$. From the definition of $\psi$ we then have
that $\psi(vv')a$ is a prefix of $\omega$ and $\psi(vv')$ ends in $\psi(v)$, so $\psi(v)a$ is a factor of $\omega$. Since $\psi(v)$ is a
palindrome and $\omega$ is closed under reversal, we obtain that for any prefix $v$ of $\Delta$ and for any letter
$a$, the word $a\psi(v)$ is a factor of $\omega$, and the third statement easily follows. □

With the preceding Lemma, we are now able to construct infinite partitions of N such that each 
element of the partition is an IP-set. 

Proposition 7.3. Let $\omega = \psi(\Delta)$ where $\Delta$ is a right infinite word on an infinite alphabet $\mathcal{A}$ with the
property that each letter $a \in \mathcal{A}$ occurs in $\Delta$ an infinite number of times. Then, for any $a \in \mathcal{A}$, the
set $a\omega|_a$ is a central set, thus $\{\omega|_a + 1\}_{a \in \mathcal{A}}$ is an infinite partition of $\mathbb{N}$ into central sets.

Proof. From Lemma 7.2 we clearly have that $\omega$ is uniformly recurrent and closed under reversal. Furthermore,
since each $a \in \mathcal{A}$ occurs in $\Delta$ an infinite number of times, statement (3) of the same lemma
applies, so that for any letter $a$, the set of factors of $a\omega$ coincides with that
of $\omega$. From this and from the uniform recurrence of $\omega$, we have that $a\omega$ is uniformly recurrent as
well. Let us denote by $\pi_a$ the image of $\omega$ under the morphism $\mu_a$ defined as follows:

• $\mu_a(a) = 0$,

• $\mu_a(x) = 1$ if $x \ne a$.

Since $a\omega$ is uniformly recurrent for any $a$, it is clear that $0\pi_a$ is also uniformly recurrent for any
$a$. From Theorem 3.11 we then have that for any $a$ there exists a minimal idempotent ultrafilter
$p_a$ such that $p_a^*(0\pi_a) = 0\pi_a$. In particular, this means, by Lemma 3.7, that $0\pi_a|_0$ (which clearly
coincides with $a\omega|_a$ by definition) is a central set for any $a$. The statement can then be easily
derived from the fact that $a\omega|_a = \omega|_a + 1$. □

8. Strong coincidence condition

Let $r \ge 2$ be a positive integer and set $A = \{1, 2, \ldots, r\}$. A primitive substitution $\tau : A \to A^+$
is said to satisfy the strong coincidence condition if and only if for any pair of fixed points $x$ and
$y$, we can write $x = scx'$ and $y = tcy'$ for some $s, t \in A^+$, $c \in A$, and $x', y' \in A^\infty$ with $s \sim_{ab} t$,
i.e., with $s$ and $t$ Abelian equivalent.
This combinatorial condition, originally due to P. Arnoux and S. Ito, is an extension of a similar
condition considered by F.M. Dekking in [14] in the case of constant length substitutions, i.e., when
$|\tau(a)| = |\tau(b)|$ for all $a, b \in A$. In this case Dekking proves that the condition is satisfied if and
only if the associated substitutive subshift has pure discrete spectrum, i.e., is metrically isomorphic
with translation on a compact Abelian group. Clearly not all primitive substitutions satisfy the
strong coincidence condition. For instance, it is not satisfied by the Thue-Morse substitution (in
fact the two fixed points disagree in each coordinate). It is conjectured however that if $\tau$ is an
irreducible primitive substitution of Pisot type, then $\tau$ satisfies the strong coincidence condition.
M. Barge and B. Diamond established this conjecture for binary primitive substitutions of Pisot
type [4]. Otherwise the conjecture remains open for substitutions on alphabets of size greater than two.
Substitutions of Pisot type provide a framework for non-constant length substitutions in which the
strong coincidence condition implies pure discrete spectrum.
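The strong coincidence condition can be explored experimentally on finite prefixes of two fixed points. The following is a minimal sketch under stated assumptions: substitutions are given as Python dictionaries from letters to strings, the function names are hypothetical, and the search is bounded by `limit`, so a result of `None` only means that no coincidence was found within the inspected prefix. The substitution `a -> abb, b -> bab` is a made-up example chosen for illustration and does not appear in the text.

```python
from collections import Counter


def fixed_point_prefix(tau, letter, length):
    """Prefix of the fixed point of tau beginning in the given letter."""
    if not tau[letter].startswith(letter) or len(tau[letter]) < 2:
        raise ValueError("tau(letter) must begin in letter and have length >= 2")
    word = letter
    while len(word) < length:
        word = "".join(tau[c] for c in word)
    return word[:length]


def first_strong_coincidence(tau, a, b, limit):
    """Smallest n with 1 <= n < limit such that x_n == y_n and the prefixes
    x_0...x_{n-1}, y_0...y_{n-1} are Abelian equivalent, or None if none is found."""
    x = fixed_point_prefix(tau, a, limit)
    y = fixed_point_prefix(tau, b, limit)
    balance = Counter()  # letter counts of x[:n] minus letter counts of y[:n]
    for n in range(limit):
        if n >= 1 and x[n] == y[n] and not any(balance.values()):
            return n
        balance[x[n]] += 1
        balance[y[n]] -= 1
    return None


thue_morse = {"a": "ab", "b": "ba"}
hypothetical = {"a": "abb", "b": "bab"}  # illustrative example, not from the text
print(first_strong_coincidence(thue_morse, "a", "b", 1024))
print(first_strong_coincidence(hypothetical, "a", "b", 1024))
```

For the hypothetical substitution the fixed points begin `abb...` and `bab...`, so the prefixes `ab` and `ba` are Abelian equivalent and are followed by the common letter `b`; for Thue-Morse the search finds nothing, in line with the remark that its two fixed points disagree in each coordinate.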
As a consequence of Theorem 3.11 we have

Corollary 8.1. Let $\tau$ be a primitive substitution verifying the strong coincidence condition. Then

(1) Any two fixed points of $\tau$ are proximal.

(2) For any pair of fixed points $x$ and $y$, there exists a minimal idempotent ultrafilter $p \in \beta\mathbb{N}$
with $p^*(x) = y$.

(3) For any pair of fixed points $x$ and $y$, and any prefix $u$ of $y$, we have that $x|_u$ is a central set.

Remark 8.2. For irreducible primitive substitutions of Pisot type, it turns out that the above
conditions (1), (2), and (3) are equivalent and each implies the strong coincidence condition. A
proof of this fact will be given in [11]. However, for a general primitive substitution we always
have that (1) $\Longleftrightarrow$ (2) $\Longrightarrow$ (3). The two fixed points of the uniform substitution $a \mapsto aaab$,
$b \mapsto bbab$ are proximal but do not satisfy the strong coincidence condition. V. Bergelson and Y.
Son [9] showed that the fixed points of $a \mapsto aab$, $b \mapsto bbaab$ satisfy (3) but not (1) and (2).

Proof. Condition (1) is immediate from the definition of strong coincidence. By Theorem 3.11 we
have that (1) implies (2) and hence (3). □

We now present an alternative and constructive proof of (3) using the so-called Dumont-Thomas
numeration systems defined by substitutions [17, 18]. Since in the irreducible Pisot case, condition
(3) alone implies the strong coincidence condition, this method of proof constitutes a new approach
to the strong coincidence conjecture. We begin with a brief review of these numeration systems.

8.1. Abstract numeration systems defined by substitutions. Let $\tau$ denote a substitution on a
finite alphabet $A$. For simplicity we assume that $\tau$ has at least one fixed point $x = x_0x_1x_2\cdots$
beginning in some letter $a \in A$. The idea behind the numeration system is quite natural: every
coordinate $x_n$ of the fixed point $x$ is in the image under $\tau$ of some coordinate $x_m$ with $m \le n$. More
precisely, consider the least integer $m \ge 0$ such that $x_0x_1\cdots x_n$ is a prefix of $\tau(x_0x_1\cdots x_m)$.
In this case we can write $x_0x_1\cdots x_n = \tau(x_0x_1\cdots x_{m-1})u_nx_n$ where $u_nx_n$ is a prefix of $\tau(x_m)$.
We now imagine a directed arc from $x_m$ to $x_n$ labeled $u_n$. In this way every coordinate $x_n$ is the
target of exactly one arc, and the source of $|\tau(x_n)|$-many arcs. It follows that for each $n$ there is a
unique path $s$ from $x_0$ to $x_n$. Thus every natural number $n$ may be represented by a finite sequence
of labels $u_i$ obtained by reading the labels along the path $s$ in the direction from $x_0$ to $x_n$.

More formally, associated to $\tau$ is a directed graph $\mathcal{G}(\tau)$ defined as follows: the vertex set of $\mathcal{G}(\tau)$
is the set $A$. Given any pair of vertices $a, b$ we draw a directed edge from $a$ to $b$ labeled $u \in A^*$
if $ub$ is a prefix of $\tau(a)$. In other words, for every occurrence of $b$ in $\tau(a)$ there is a directed edge
from $a$ to $b$ labeled by the prefix (possibly empty) of $\tau(a)$ preceding the given occurrence of $b$.
Figure 1 depicts the graph $\mathcal{G}(\tau)$ for the Fibonacci substitution $a \mapsto ab$, $b \mapsto a$.



Figure 1. The Fibonacci automaton

For simplicity, in case some letter $b$ occurs multiple times in $\tau(a)$, we draw just one directed
edge from $a$ to $b$ having multiple labels as described above. This is shown in Figure 2 in the case
of the substitution $a \mapsto aab$, $b \mapsto bbaab$.

Figure 2. The automaton of $a \mapsto aab$, $b \mapsto bbaab$.

Let $x = x_0x_1x_2\cdots$ denote the fixed point of $\tau$ beginning in $a$. Then the graph $\mathcal{G}(\tau)$ has a
singleton loop based at $a$ labeled with the empty word $\varepsilon$. We consider this to be the empty or 0th
path at $a$. More generally by a path at $a \in A$ we mean a finite sequence of edge labels $u_0u_1u_2\cdots u_n$
corresponding to a path in $\mathcal{G}(\tau)$ originating at vertex $a$ with the condition that $u_0 \ne \varepsilon$ whenever
the length of the path $n > 0$. For example in the case of the Fibonacci substitution, except for the
path $s = \varepsilon$, each path is given by a word in $\{a, \varepsilon\}$ beginning in $a$ and not containing the factor $aa$.
For each path $s = u_0u_1u_2\cdots u_n$ set

$\rho(s) = \tau^n(u_0)\tau^{n-1}(u_1)\tau^{n-2}(u_2)\cdots\tau(u_{n-1})u_n$

and $\lambda(s) = |\rho(s)|$. In [17, 18] it is shown that for each path $s$ at $a$, the word $\rho(s)$ is a prefix of the
fixed point $x$ at $a$ and conversely for each prefix $u$ of $x$ there is a unique path $s$ at $a$ with $\rho(s) = u$.
This correspondence defines a numeration system in which every natural number $l$ is represented
by the path $s = u_0u_1u_2\cdots u_n$ in $\mathcal{G}(\tau)$ from vertex $a$ to vertex $x_l$ corresponding to the prefix of
length $l$ of $x$, so that

(*) $l = \lambda(s) = |\tau^n(u_0)| + |\tau^{n-1}(u_1)| + |\tau^{n-2}(u_2)| + \cdots + |\tau(u_{n-1})| + |u_n|$.

Generally by the numeration system one means the quantities $|\tau^n(u)|$ for all $n \ge 0$ and all
proper prefixes $u$ of the images under $\tau$ of the letters of $A$. Then a proper representation of $l$ in this
numeration is an expression of the form (*) corresponding to a path $s = u_0u_1u_2\cdots u_n$ in $\mathcal{G}(\tau)$.

In the case of a uniform substitution of length $k$ this corresponds to the usual base-$k$ expansion
of $l$. In the case of the Fibonacci substitution, each $u_i \in \{\varepsilon, a\}$ and $u_iu_{i+1} \ne aa$ for each $0 \le i \le
n-1$. Thus this representation of $l$ is precisely the Zeckendorf representation of $l$ discussed in §4
in which $l$ is expressed as a sum of distinct Fibonacci numbers via the greedy algorithm.

In general, this numeration system depends not only on the substitution $\tau$ but also on the choice
of fixed point. For example for the substitution in Figure 2 the number 5 is represented by the path
$a, aa$ from vertex $a$ or by the path $b, \varepsilon$ from vertex $b$. In fact, $\tau(a)aa = aabaa$ is the prefix of length
5 of $\tau^\infty(a)$ while $\tau(b)\varepsilon = bbaab$ is the prefix of length 5 of $\tau^\infty(b)$.
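The passage from a natural number to its path can be made concrete by following the arcs backwards, from the coordinate of index l to the coordinate of index 0. The sketch below is an illustration only, under the assumption that a substitution is stored as a Python dictionary from letters to strings; the names `path_of`, `rho` and `apply_power` are hypothetical, and the empty word is represented by the empty string.

```python
def fixed_point_prefix(tau, a, length):
    """Prefix of the fixed point of tau beginning in a (tau(a) must begin in a)."""
    if not tau[a].startswith(a) or len(tau[a]) < 2:
        raise ValueError("tau(a) must begin in a and have length >= 2")
    word = a
    while len(word) < length:
        word = "".join(tau[c] for c in word)
    return word[:length]


def apply_power(tau, word, k):
    """Apply the substitution k times to a finite word."""
    for _ in range(k):
        word = "".join(tau[c] for c in word)
    return word


def path_of(tau, a, l):
    """Edge labels u_0, ..., u_n of the path at a representing l ([] for l = 0)."""
    x = fixed_point_prefix(tau, a, l + 1)
    labels = []
    pos = l
    while pos > 0:
        # Locate the coordinate x_m whose image tau(x_m) covers position pos.
        m, start = 0, 0
        while start + len(tau[x[m]]) <= pos:
            start += len(tau[x[m]])
            m += 1
        labels.append(x[start:pos])  # prefix of tau(x_m) preceding x_pos
        pos = m
    return labels[::-1]


def rho(tau, labels):
    """The word tau^n(u_0) tau^(n-1)(u_1) ... tau(u_(n-1)) u_n."""
    n = len(labels) - 1
    return "".join(apply_power(tau, u, n - i) for i, u in enumerate(labels))


fibonacci = {"a": "ab", "b": "a"}
for l in range(8):
    print(l, path_of(fibonacci, "a", l))
```

For the substitution of Figure 2, `path_of` applied to 5 gives the labels `a, aa` from vertex `a` and `b` followed by the empty word from vertex `b`, matching the example above.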

An alternative reformulation is as follows: Given two distinct paths $s = u_0u_1u_2\cdots u_n$ and
$t = v_0v_1v_2\cdots v_m$ both starting from the same vertex $a$, we write $s < t$ if either $n < m$ or if $n = m$ and
there exists $i \in \{0, 1, 2, \ldots, n\}$ such that $u_j = v_j$ for $j < i$, and $|u_i| < |v_i|$. This defines a total order
on the set of all paths starting from vertex $a$. In the case of the Fibonacci substitution, we list the
paths at $a$ in increasing order

$\varepsilon,\ a,\ a\varepsilon,\ a\varepsilon\varepsilon,\ a\varepsilon a,\ a\varepsilon\varepsilon\varepsilon,\ a\varepsilon\varepsilon a,\ a\varepsilon a\varepsilon,\ a\varepsilon\varepsilon\varepsilon\varepsilon,\ \ldots$

Thus there is an order preserving correspondence between $0, 1, 2, 3, \ldots$ and the set of all paths at $a$
ordered in increasing order.

While these numeration systems are very natural and simple to define, they are typically ex- 
tremely difficult to work with in terms of addition and multiplication. 

Let $a$ and $b$ be distinct vertices in $\mathcal{G}(\tau)$. We say a path $s$ originating at $a$ is synchronizing relative
to $b$ if there exists a path $s'$ originating at $b$ having the same terminal vertex as $s$ and with $\lambda(s) =
\lambda(s')$. From this point of view the strong coincidence conjecture implies that

$\{\lambda(s) \mid s \text{ is a synchronizing path relative to } b\}$

is a thick set.

8.2. Proof of (3) in Corollary 8.1. Let $\tau$ be a primitive substitution satisfying the strong coincidence
condition. Suppose $x$ and $y$ are fixed points of $\tau$ beginning in $a$ and $b$ respectively. Then we
can write $x = scx'$ and $y = tcy'$ for some $s, t \in A^+$, $c \in A$, and $x', y' \in A^\infty$ with $s \sim_{ab} t$. By
replacing $\tau$ by a sufficiently large power of $\tau$, we can assume that

• $sc$ is a prefix of $\tau(a)$,

• $tc$ is a prefix of $\tau(b)$,

• $b$ occurs in $\tau(c)$.




Figure 3. Vertices $a$, $b$, $c$ of $\mathcal{G}(\tau)$

Thus in $\mathcal{G}(\tau)$ there is a directed edge from $a$ to $c$ labeled $s$, a directed edge from $b$ to $c$ labeled $t$,
and a directed edge from $c$ to $b$ labeled $r$ for some prefix $r$ of $\tau(c)$. See Figure 3.
We now define a sequence of paths $(p_i)_{i \ge 0}$ from $a$ to $b$ by

$p_i = s, r, \underbrace{\varepsilon, \varepsilon, \ldots, \varepsilon}_{2i}$.

Put $n_i = \lambda(p_i)$. Then clearly $\{n_i \mid i \ge 0\} \subseteq x|_b$. We now show that any finite sum of distinct
elements from the set $\{n_i \mid i \ge 0\}$ is contained in $x|_b$. Set

$q_i = t, r, \underbrace{\varepsilon, \varepsilon, \ldots, \varepsilon}_{2i}$.

Then each $q_i$ is a path from $b$ to $b$ and since $s$ and $t$ are Abelian equivalent it follows that $\lambda(p_i) =
\lambda(q_i)$. Fix $k \ge 1$ and choose $i_1 < i_2 < \cdots < i_k$. Then

$\sum_{j=1}^{k} \lambda(p_{i_j}) = \lambda(p_{i_k}) + \sum_{j=1}^{k-1} \lambda(p_{i_j})$

$= \lambda(p_{i_k}) + \sum_{j=1}^{k-1} \lambda(q_{i_j})$

$= |\tau^{2i_k+1}(s)| + |\tau^{2i_k}(r)| + \sum_{j=1}^{k-1} \left( |\tau^{2i_j+1}(t)| + |\tau^{2i_j}(r)| \right)$

$= |\tau^{2i_k+1}(s)\,\tau^{2i_k}(r)\,\tau^{2i_{k-1}+1}(t)\,\tau^{2i_{k-1}}(r)\,\tau^{2i_{k-2}+1}(t)\,\tau^{2i_{k-2}}(r) \cdots \tau^{2i_1+1}(t)\,\tau^{2i_1}(r)|$

which is represented by a path in $\mathcal{G}(\tau)$ from $a$ to $b$ and hence corresponds to an occurrence of $b$
in $x$. This shows that $x|_b$ is an IP-set, and hence by Theorem 3.12 $x|_b$ is a central set. A similar
argument applies for any prefix $u$ of $y$ by defining the paths $p_i$ by

$p_i = s, r, \underbrace{\varepsilon, \varepsilon, \ldots, \varepsilon}_{N_i}$

with $N_i$ sufficiently large.



References 

[1] P. Arnoux and G. Rauzy, Représentation géométrique de suites de complexité 2n + 1, Bull. Soc. Math. France 119 (1991), p. 199-215.
[2] J. Auslander, Minimal flows and their extensions, North-Holland Mathematical Studies, vol. 153, North-Holland, 1988.
[3] V. Baker, M. Barge and J. Kwapisz, Geometric realization and coincidence for reducible non-unimodular Pisot tiling spaces with an application to β-shifts, Numeration, pavages, substitutions, Ann. Inst. Fourier (Grenoble) 56 No. 7 (2006), p. 2213-2248.
[4] M. Barge and B. Diamond, Coincidence for substitutions of Pisot type, Bull. Soc. Math. France 130 (2002), p. 619-626.
[5] M. Barge and B. Diamond, Proximality in Pisot tiling spaces, Fund. Math. 194 no. 3 (2007), p. 191-238.
[6] V. Bergelson, Minimal idempotents and ergodic Ramsey theory, Topics in dynamics and ergodic theory, p. 8-39, London Math. Soc. Lecture Note Ser., 310, Cambridge Univ. Press, Cambridge, 2003.
[7] V. Bergelson and N. Hindman, Nonmetrizable topological dynamics and Ramsey Theory, Trans. Amer. Math. Soc. 320 (1990), p. 293-320.
[8] V. Bergelson, N. Hindman and D. Strauss, Strongly central sets and sets of polynomial returns mod 1, Proc. Amer. Math. Soc., to appear.
[9] V. Bergelson and Y. Son, Personal communication.






[10] A. Blass, Ultrafilters: where topological dynamics = algebra = combinatorics, Topology Proc. 18 (1993), p. 33-56.
[11] M. Bucci, S. Puzynina and L.Q. Zamboni, Words, Numerations and the Stone-Čech compactification of N, to appear in a forthcoming book "Recent Mathematical Developments in Aperiodic Order" edited by J. Kellendonk, D. Lenz and J. Savinien.
[12] J. Cassaigne, S. Ferenczi, and L.Q. Zamboni, Imbalances in Arnoux-Rauzy sequences, Ann. Inst. Fourier (Grenoble), 50 (2000), no. 4, p. 1265-1276.
[13] D. De, N. Hindman and D. Strauss, A New and Stronger Central Sets Theorem, Fundamenta Mathematicae 199 (2008), p. 155-175.
[14] F.M. Dekking, The spectrum of dynamical systems arising from substitutions of constant length, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 41 (1977/1978), p. 221-239.
[15] F.M. Dekking and M. Keane, Mixing properties of substitutions, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 42 (1978), p. 23-33.
[16] A. de Luca, Sturmian words: structure, combinatorics, and their arithmetics, Theoret. Comput. Sci. 183 (1997), p. 45-82.
[17] J.-M. Dumont and A. Thomas, Systèmes de numération et fonctions fractales relatifs aux substitutions, Theoret. Comput. Sci., 65 (2) (1989), p. 153-169.
[18] J.-M. Dumont and A. Thomas, Digital sum moments and substitutions, Acta Arith., 64 (1993), p. 205-225.
[19] M. Edson and L.Q. Zamboni, On the number of partitions of an integer in the m-bonacci base, Numeration, pavages, substitutions, Ann. Inst. Fourier (Grenoble) 56 (2006), no. 7, p. 2271-2283.
[20] R. Ellis, Distal transformation groups, Pac. J. Math. 8 (1958), p. 401-405.
[21] D. G. Fon-Der-Flaass and A. E. Frid, On periodicity and low complexity of infinite permutations, European J. of Combin. 28 (2007), p. 2106-2114.
[22] H. Furstenberg, Recurrence in Ergodic Theory and Combinatorial Number Theory, Princeton University Press, 1981.
[23] N. Hindman, Finite sums of sequences within cells of a partition of N, J. Combinatorial Theory (Series A) 17 (1974), p. 1-11.
[24] N. Hindman, Ultrafilters and Ramsey theory - an update, Set theory and its Applications (J. Steprans & S. Watson eds.), Lecture Notes in Mathematics 1401, Springer-Verlag, 1989, p. 97-118.
[25] N. Hindman, I. Leader and D. Strauss, Infinite partition regular matrices: solutions in central sets, Trans. Amer. Math. Soc. 355 (2003), p. 1213-1235.
[26] N. Hindman and D. Strauss, Algebra in the Stone-Čech compactification. Theory and applications, de Gruyter Expositions in Mathematics 27, Walter de Gruyter & Co., Berlin, 1998.
[27] N. Hindman and D. Strauss, A simple characterization of sets satisfying the Central Sets Theorem, New York J. Math 15 (2009), p. 405-413.
[28] T. Kamae, Uniform sets and super-stationary sets over general alphabets, Ergod. Th. & Dynam. Sys. (DOI 10.1017/S014338571000074X), to appear.
[29] T. Kamae, Behavior of various complexity functions, preprint 2011.
[30] M. Lothaire, Algebraic Combinatorics on Words, Cambridge UK, Cambridge University Press, 2002.
[31] M. Morse and G.A. Hedlund, Symbolic Dynamics II: Sturmian trajectories, Amer. J. Math., 62 (1) (1940), p. 1-42.
[32] E. Zeckendorf, Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas, Bull. Soc. Royale Sci. Liège, 42 (1972), p. 179-182.






Department of Mathematics, FUNDIM, University of Turku, FIN-20014 Turku, Finland.
E-mail address: michelangelo.bucci@utu.fi, micbucci@unina.it

Department of Mathematics, FUNDIM, University of Turku, FIN-20014 Turku, Finland. Also,
Sobolev Institute of Mathematics, 4 Acad. Koptyug avenue, 630090 Novosibirsk, Russia.
E-mail address: svetlana.puzyina@utu.fi

Université de Lyon, Université Lyon 1, CNRS UMR 5208, Institut Camille Jordan, 43 boulevard
du 11 novembre 1918, F69622 Villeurbanne Cedex, France. Also, Department of Mathematics,
FUNDIM, University of Turku, FIN-20014 Turku, Finland.

E-mail address: zamboni@math.univ-lyon1.fr, luca.zamboni@utu.fi



